ABSTRACT


In this report a new Cramer-Rao (CR) type lower bound is derived which takes
into account a user-specified constraint on the length of the gradient of estimator bias
with respect to the set of underlying parameters. If the parameter space is bounded,
the constraint on bias gradient translates into a constraint on the magnitude of the
bias itself; for unbiased estimation the bound reduces to the standard unbiased form
of the CR bound. In addition to its usefulness as a lower bound that is insensitive
to small biases in the estimator, the rate of change of the new bound provides
a quantitative bias "sensitivity index" for the general bias-dependent CR bound.
An analytical form for this sensitivity index is derived which indicates that small
estimator biases can make the new bound significantly less than the unbiased CR
bound when important but difficult-to-estimate nuisance parameters exist. This
implies that the application of the CR bound is unreliable for this situation due
to severe bias sensitivity. As a practical illustration of these results, the problem
of estimating elements of the 2 x 2 covariance matrix associated with a pair of
independent identically distributed (IID) zero-mean Gaussian random sequences is
presented.
1.  INTRODUCTION


This report deals with the problem of bounding the variance of parameter estimators under the
constraint of small bias. In multiple parameter estimation problems, the variance of the estimates
of a single parameter can appear to violate the unbiased Cramer-Rao (CR) lower bound due to
the presence of extremely small biases; that is, the actual variance of the estimator is lower than
that predicted by the CR bound (for a particularly simple example of bound violation see Stoica
and Moses [1]). This indicates that the unbiased CR bound may be an unreliable predictor of
performance even when biases are otherwise insignificant. On the other hand, the application of
the general CR bound for biased estimators depends on knowledge of the particular bias of the
estimator; in particular, it is necessary to know the gradient of the bias with respect to the vector
of unknown parameters. However, the precise evaluation of estimator bias is frequently difficult
and not of direct interest when bias is small. Furthermore, for performance comparisons, a useful
lower bound should apply to the entire class of estimators with acceptably small bias.

In this report a new CR-type lower bound is derived which takes into account a user-specified
constraint on the length of the gradient of estimator bias with respect to the vector of unknown
parameters. Alternatively, the bound takes into account a constraint on the actual estimator bias
as the unknown parameters range over a specified ellipsoid. This bound is uniform with respect to
a special class of biased estimators: those whose bias gradient has a length of less than or equal to
$\delta \le 1$. In addition to its usefulness as a bias-insensitive lower bound, the slope of the new bound
as a function of $\delta$ provides a characterization of the bias sensitivity of the general CR bound on
estimator variance. For a given estimation problem, an overly large magnitude of bias sensitivity
provides a warning against use of the unbiased CR bound. If an upper bound on the bias gradient
of the estimator is specified, our lower bound on estimator variance can subsequently be applied.

The  specific  results  developed  herein  follow. 

1.  A  geometric  point  of  view  provides  some  insight  into  the  behavior  of  the  general  CR 
bound; 

2.  A functional minimization is performed to arrive at the new bound based on the
Fisher information matrix;

3.  Results of an asymptotic analysis of the new bound as bias → 0 indicate important
factors controlling bias sensitivity of general CR bounds;

4.  The asymptotic analysis suggests a "bias-sensitivity index," which is the slope of the
new bound as a function of the length $\delta$ of the bias gradient. This index indicates
the impact of difficult-to-estimate "nuisance" parameters on the magnitude of the
general CR bound;

5.  The  form  of  the  new  bound  is  suggestive  of  “superefficient,”  essentially  unbiased 
estimator  structures  which  could  outperform  absolutely  unbiased  estimators  in  the 
sense  of  mean-squared-error; 
6.  Sensitivity results are obtained for estimation of the elements of the 2 x 2 covariance
matrix associated with a pair of independent identically distributed (IID) zero-mean
Gaussian random sequences.

The report is organized as follows. Section 2.1 gives the notation. Section 2.2 is a summary
of useful vector and matrix relations. Section 3 defines the class of essentially unbiased estimators.
Section 4 is a geometric interpretation of the CR bound in terms of its bias dependency. The new
bound is derived in Section 5. In Section 6 the slope of the bound is derived and an asymptotic
approximation to the new bound is given. Section 7 is a discussion of the results and an interpretation
of the new bound in terms of the joint "estimability" of the multiple parameters. Finally,
Section 8 applies the new bound to covariance estimation for a pair of IID Gaussian sequences.
2.  PRELIMINARIES


2.1  Notation 


General notational conventions are as follows. If $\{(y, v) : v = g(y)\}$ is the graph of a function,
then $g$ denotes the function and $g(y)$ denotes a functional evaluation at the point $y$. An exception
to this convention occurs when $g$ is used to denote $g(y)$ for compactness of notation. In general, an
uppercase letter near the end of the alphabet, e.g., $X$, denotes a random variable, random vector, or
random process and the corresponding lowercase, e.g., $x$, denotes its realization. An uppercase letter
near the beginning of the alphabet, e.g., $F$, denotes a matrix. The $i$th-$j$th element of a matrix $F$ is
denoted $F_{ij}$ or $((F))_{ij}$. An underbar denotes a column vector, e.g., $\underline{d}$, and a superscript $T$ denotes
the transpose, e.g., $\underline{d}^T$. For vectors, subscripts index over the elements, e.g., $\underline{\theta} = [\theta_1, \ldots, \theta_n]^T$, while
superscripts discriminate between different vector quantities, e.g., $\underline{\theta}^o = [\theta^o_1, \ldots, \theta^o_n]^T$. The gradient
operator is, by convention, a row vector of partial derivatives $\nabla_{\underline{\theta}} = [\partial/\partial\theta_1, \ldots, \partial/\partial\theta_n]$. For convenience,
when there is no risk of confusion the simplified notation $\nabla g(\underline{\theta}) \stackrel{\mathrm{def}}{=} \nabla_{\underline{u}}\, g(\underline{u})|_{\underline{u}=\underline{\theta}}$ will be used for the
gradient of a scalar or vector valued function $g(\underline{u})$ at a point $\underline{u} = \underline{\theta}$.

Some  particularly  useful  definitions: 

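•  $X$ : the generic observation; e.g., a set of snapshots of the data outputs from multiple
sensors.

•  $x$ : a realization of $X$.

•  $\Theta$ : the $n$-dimensional parameter space.

•  $\underline{\theta}$, $\underline{u}$ : parameter vectors.

•  $\|\underline{d}\|$ : the Euclidean norm of vector $\underline{d}$, $\|\underline{d}\| = \sqrt{\underline{d}^T \underline{d}}$.

•  $f(x; \underline{\theta})$ : the probability density function of $X$ evaluated at $X = x$ for the parameter value $\underline{\theta}$.

•  $\hat{\underline{\theta}} = \hat{\underline{\theta}}(X)$ : an estimator of $\underline{\theta}$.

•  $\underline{m}(\underline{\theta})$ : the mean vector $E_{\underline{\theta}}(\hat{\underline{\theta}})$ of $\hat{\underline{\theta}}$, where $\underline{\theta}$ is the true underlying parameter.

•  $\underline{b}(\underline{\theta})$ : the bias, $\underline{m}(\underline{\theta}) - \underline{\theta}$, of $\hat{\underline{\theta}}$.

•  $\mathrm{cov}_{\underline{\theta}}(\hat{\underline{\theta}})$ : the $n \times n$ covariance matrix of $\hat{\underline{\theta}}$, $E_{\underline{\theta}}[(\hat{\underline{\theta}} - \underline{m}(\underline{\theta}))(\hat{\underline{\theta}} - \underline{m}(\underline{\theta}))^T]$.

•  $\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1)$ : the variance of $\hat{\theta}_1$.

•  $F(\underline{\theta})$ : the $n \times n$ Fisher information matrix associated with estimators of $\underline{\theta}$. This
matrix will always be assumed to have bounded elements and to be invertible with
a bounded inverse $F^{-1}(\underline{\theta})$ over any domain of $\underline{\theta}$ of interest.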
•  $a$, $\underline{c}$, $F_s$ : the Fisher information for $\theta_1$, the information coupling between $\theta_1$ and
$\theta_2, \ldots, \theta_n$, and the $(n-1) \times (n-1)$ Fisher information matrix for $\theta_2, \ldots, \theta_n$. Note
that $F = \begin{bmatrix} a & \underline{c}^T \\ \underline{c} & F_s \end{bmatrix}$, where $a$ is positive and $F_s$ is invertible by assumption.

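•  $l(\underline{\theta})$ : the log-likelihood function, $\ln f(X; \underline{\theta})$.

•  $\bar{l}(\underline{\theta}, \underline{\theta}^o)$ : the mean log-likelihood (ambiguity) function, $E_{\underline{\theta}^o}[\ln f(X; \underline{\theta})]$.

•  $\delta$ : a user-specified upper bound on the length of the bias-gradient vector.

•  $B_b(\underline{\theta})$ : the general CR lower bound on $\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1)$ for estimators with bias $b_1(\underline{\theta})$ (28).

•  $B(\underline{\theta}, \delta)$ : the new lower bound on $\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1)$ for estimators with bias $b_1(\underline{\theta})$ such that
$\|\nabla b_1\| \le \delta$ (56).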

•  $\Delta B(\underline{\theta}, \delta)$ : the normalized difference between the unbiased CR bound and the new
bound.

•  $\underline{d}_{\min}$ : the minimizing bias-gradient vector which characterizes $B(\underline{\theta}, \delta)$ (65).

•  $\lambda$ : a scaling constant determined by the solution to the constraint equation on the
bias gradient (57).

•  $\eta$ : the sensitivity index of the general CR bound, derived from $B(\underline{\theta}, \delta)$.


2.2  Identities 

Some  vector  and  matrix  identities  to  be  used  in  the  sequel  are  given  here. 
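•  Let $A$ be an invertible $m \times m$ matrix which has the partition

$$A = \begin{bmatrix} a & \underline{c}^T \\ \underline{c} & A_s \end{bmatrix} , \qquad (1)$$

where $a$ is a nonzero scalar, $\underline{c}$ is an $(m-1) \times 1$ vector, and $A_s$ is an $(m-1) \times (m-1)$
invertible matrix. The inverse of $A$ can be expressed in terms of the partition elements
$a$, $\underline{c}$, and $A_s$ ([2], Theorem 8.2.1):

$$A^{-1} = \begin{bmatrix} \dfrac{1}{a - \underline{c}^T A_s^{-1} \underline{c}} & -\dfrac{\underline{c}^T A_s^{-1}}{a - \underline{c}^T A_s^{-1} \underline{c}} \\[2ex] -\dfrac{A_s^{-1} \underline{c}}{a - \underline{c}^T A_s^{-1} \underline{c}} & A_s^{-1} + \dfrac{A_s^{-1} \underline{c}\, \underline{c}^T A_s^{-1}}{a - \underline{c}^T A_s^{-1} \underline{c}} \end{bmatrix} . \qquad (2)$$

•  Let $A$ be an $m \times m$ invertible matrix and let $U$ and $V$ be $m \times k$ matrices.
If the matrix $[A + UV^T]$ is nonsingular, the Sherman-Morrison-Woodbury identity
gives the inverse as ([3], Section 0.7.4)

$$[A + UV^T]^{-1} = A^{-1} - A^{-1} U [I + V^T A^{-1} U]^{-1} V^T A^{-1} . \qquad (3)$$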
•  For the gradient of quadratic forms and inner products with respect to a vector, we
have the following identities ([2], Section 10.8):

$$\nabla_{\underline{x}}\, \underline{x}^T A \underline{x} = 2 \underline{x}^T A , \quad (A \text{ symmetric}) \qquad (4)$$

$$\nabla_{\underline{x}}\, \underline{y}^T \underline{x} = \underline{y}^T . \qquad (5)$$

•  If $A$ and $B$ are symmetric matrices which possess identical eigenvectors, then $AB =
BA$ ([3], Theorem 4.1.6). If, in addition, $A$ and $B$ are positive-definite, then $AB$ is
positive-definite ([4], p. 350, Exercise 23).

•  Let $A$, $B$, and $C$ be matrices and assume $B$ is symmetric and positive-definite. If $\lambda$
is a scalar, a singular value decomposition of $B$ establishes the following for positive
integer $k$:

$$\frac{d}{d\lambda} \left\{ A [I + \lambda B]^{-k} C \right\} = -k\, A B [I + \lambda B]^{-k-1} C . \qquad (6)$$
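The identities (2), (3), and (6) are easy to spot-check numerically. The following is a minimal NumPy sketch, not part of the original report: the matrix sizes, the random seed, and all variable names are illustrative assumptions, and the derivative in (6) is compared against a central finite difference rather than derived.

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 5, 2

# Symmetric positive-definite A, partitioned as [[a, c^T], [c, A_s]] as in (1).
G = rng.standard_normal((m, m))
A = G @ G.T + m * np.eye(m)
a, c, A_s = A[0, 0], A[1:, 0], A[1:, 1:]
A_inv = np.linalg.inv(A)

# Identity (2): assemble the inverse from the partition elements.
A_s_inv = np.linalg.inv(A_s)
schur = a - c @ A_s_inv @ c                      # scalar a - c^T A_s^{-1} c
blocks = np.empty((m, m))
blocks[0, 0] = 1.0 / schur
blocks[0, 1:] = -(c @ A_s_inv) / schur
blocks[1:, 0] = -(A_s_inv @ c) / schur
blocks[1:, 1:] = A_s_inv + np.outer(A_s_inv @ c, c @ A_s_inv) / schur
partition_ok = np.allclose(blocks, A_inv)

# Identity (3): Sherman-Morrison-Woodbury for a rank-k update.
U = 0.1 * rng.standard_normal((m, k))
V = 0.1 * rng.standard_normal((m, k))
smw = A_inv - A_inv @ U @ np.linalg.inv(np.eye(k) + V.T @ A_inv @ U) @ V.T @ A_inv
smw_ok = np.allclose(smw, np.linalg.inv(A + U @ V.T))

# Identity (6): d/d(lambda) of P [I + lambda B]^{-k} Q, checked by finite differences.
B = A_s / m                                      # symmetric positive-definite
I = np.eye(m - 1)
P = rng.standard_normal((2, m - 1))
Q = rng.standard_normal((m - 1, 2))
lam, k_pow, h = 0.3, 3, 1e-6

def g(t):
    return P @ np.linalg.matrix_power(np.linalg.inv(I + t * B), k_pow) @ Q

numeric = (g(lam + h) - g(lam - h)) / (2 * h)
analytic = -k_pow * P @ B @ np.linalg.matrix_power(np.linalg.inv(I + lam * B), k_pow + 1) @ Q
derivative_ok = np.allclose(numeric, analytic, rtol=1e-5, atol=1e-8)

print(partition_ok, smw_ok, derivative_ok)
```

A check of this kind only confirms the identities for one random instance; it is a sanity test, not a proof.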
3.  ESSENTIALLY  UNBIASED  ESTIMATORS


This section deals with the following general setup of our estimation problem. Let an observation
$X$ have the probability density function $f(x; \underline{\theta})$ where

$$\underline{\theta} = (\theta_1, \ldots, \theta_n)^T \qquad (7)$$

is a real, nonrandom parameter vector residing in an open subset $\Theta = \Theta_1 \times \cdots \times \Theta_n$ of the
$n$-dimensional space $\mathbb{R}^n$. Suitably modified, the theory herein can be applied to more general $\Theta$ (e.g.,
subsets of $\mathbb{R}^n$ which are defined by differentiable functional inequality and equality constraints) by
replacing the Fisher information matrix $F(\underline{\theta})$ (21), used throughout the report, with a reduced-rank
Fisher matrix [5]. We use the conventional notation for expectation of a random variable $Z$ with
respect to $f(x; \underline{\theta})$,

$$E_{\underline{\theta}}[Z] = \int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} Z(x) f(x; \underline{\theta})\, dx , \qquad (8)$$

where $\mathrm{supp}\, f(\cdot; \underline{\theta}) = \{x : f(x; \underline{\theta}) > 0\}$ is the support of the probability density function (PDF) $f$
for fixed $\underline{\theta}$.

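Let $\hat{\theta}_1$ be an estimator of $\theta_1$ with mean $E_{\underline{\theta}}(\hat{\theta}_1) = m_1(\underline{\theta})$ and bias

$$b_1(\underline{\theta}) = m_1(\underline{\theta}) - \theta_1 . \qquad (9)$$

The bias is said to be "globally removable" if $b_1$ is a constant independent of $\underline{\theta}$ and "locally
removable over a region $\underline{\theta} \in \mathcal{D}$" if $b_1$ is constant over the region $\mathcal{D}$. By convention, when we refer
to a "region $\mathcal{D}$ in $\Theta$" we mean a nonempty, open, connected subset of $\Theta$. The estimator is said
to be globally (locally) unbiased if the bias is globally (locally) removable. In the sequel we will
address the problem of lower bounding the variance of $\hat{\theta}_1$, given that $\hat{\theta}_1$ is "essentially unbiased"
in the sense that for a prespecified constant $\delta \in [0,1]$

$$\|\nabla b_1(\underline{\theta})\|^2 = \left\| \left[ \frac{\partial b_1(\underline{\theta})}{\partial \theta_1}, \ldots, \frac{\partial b_1(\underline{\theta})}{\partial \theta_n} \right] \right\|^2 \le \delta^2 , \quad \forall \underline{\theta} \in \Theta . \qquad (10)$$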


Bias gradients contained in the constraint set $\{\underline{d} : \underline{d}^T \underline{d} \le \delta^2\}$ for all values of $\underline{\theta}$ are called admissible
bias gradients. Note, however, that vector functions $\underline{d}(\underline{\theta})$ exist which satisfy the constraint in (10)
for all $\underline{\theta}$ but are not valid gradient functions, and therefore are not admissible bias gradients. The
importance of the bias gradient in (10) in lower bounding the variance of $\hat{\theta}_1$ will be seen in Equation
(28).

The restriction $\delta \in [0,1]$ in (10) is sufficiently general, since for $\delta \ge 1$ the gradient can be
taken as $\nabla b_1 = [-1, 0, \ldots, 0]$. This bias gradient corresponds to the trivial estimator $\hat{\theta}_1 = \text{constant}$,
which has zero variance. Observe that for Inequality (10) to be well defined the bias must be
differentiable. Ibragimov and Has'minskii [6], Chapter 1, Lemma 7.2, show that under essentially
the same regularity conditions which guarantee the existence of the Fisher information, the bias
exists and is differentiable regardless of the estimator $\hat{\theta}_1$. Hence, the differentiability property
depends only on the underlying distribution of the observations, not on the particular form of
the estimator. Differentiability is therefore not a restrictive assumption in characterizing classes of
biased estimators for a particular estimation problem.

More significantly, note that Inequality (10) is a constraint on the rate of change of the bias
and not on the bias itself. However, the definition of an acceptable range of the bias is typically
more natural than the definition of an acceptable range of the bias gradient; this issue will be
discussed presently. For sufficiently small $\delta$, (10) implies that in a practical sense the bias is locally
removable over any prespecified finite region; therefore the estimator is locally unbiased. On the
other hand, for bounded parameters (10) can be related to global unbiasedness.

The bound presented here is applicable if, for example, a user is interested in a lower bound
on estimator variance which applies to a class of estimators permitted to have small, perhaps
"acceptable," biases over a parameter range of interest. As mentioned previously, it is generally
more natural to specify an acceptable range of biases than an acceptable range of bias gradients.

We will now show how the former can be converted to the latter. Assume that the user specifies
an ellipsoid of parameters centered at some parameter $\underline{\theta} = \underline{\nu}$ and a maximal allowable variation in
the bias over the ellipsoid. This requirement is stated mathematically as

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| \le \gamma , \quad \forall \underline{\theta}, \underline{\theta}^o \in \{\underline{u} : \|\mathrm{diag}(K_i)(\underline{u} - \underline{\nu})\| \le 1\} , \qquad (11)$$

where $\mathrm{diag}(K_i)$ is a diagonal matrix of positive constants and the user-specified quantities $K_1, \ldots, K_n$
and $\gamma$ determine the ellipsoid and the maximal allowable bias variation, respectively. The ellipsoid
of (11) also reflects the user's choice of units to represent each of the parameters.

To standardize the analysis, it is convenient to normalize the ellipsoid to a sphere via a coordinate
transformation (scaling) of the parameters. This coordinate transformation is implemented
by premultiplying parameter vectors in the original coordinates by the diagonal matrix $\mathrm{diag}(K_i)$
in (11). The reader may verify that the result of this transformation is to replace the quantities
$[\underline{\theta}, \underline{\theta}^o, \underline{\nu}, b_1(\underline{\theta}), b_1(\underline{\theta}^o), \gamma]$ (which are parameterized in the original coordinates) in (11) with the quantities
$[\mathrm{diag}(K_i^{-1})\underline{\theta}, \mathrm{diag}(K_i^{-1})\underline{\theta}^o, \mathrm{diag}(K_i^{-1})\underline{\nu}, K_1^{-1} b_1(\underline{\theta}), K_1^{-1} b_1(\underline{\theta}^o), K_1^{-1}\gamma]$ (where $[\underline{\theta}, \underline{\theta}^o, \underline{\nu}, b_1(\underline{\theta}), b_1(\underline{\theta}^o), \gamma]$
are parameterized in the new coordinates). It is then seen that (11) becomes equivalent to a bias
constraint over a displaced unit sphere in the new coordinates

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| \le \gamma , \quad \forall \underline{\theta}, \underline{\theta}^o \in \{\underline{u} : \|\underline{u} - \underline{\nu}\| \le 1\} . \qquad (12)$$

Throughout the rest of the report it is assumed that the user ellipsoid has been normalized to the
standard spherical region (12).

The  following  proposition  translates  the  user  constraint  on  the  bias  (12)  to  the  constraint  on 
the  bias  gradient  (10). 

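Proposition 1  Let $b_1(\underline{\theta})$ be a differentiable scalar (bias) function, with (bias) gradient $\nabla b_1$, over
the spherical region $S^n = \{\underline{u} : \|\underline{u} - \underline{\nu}\| \le 1\}$. Then the set of $n$-dimensional vectors

$$\mathcal{D}_\gamma = \{\underline{d} : \|\underline{d}\| \le \gamma/2\} \qquad (13)$$

defines the largest region in $\mathbb{R}^n$ containing gradients $\nabla b_1$ for which $b_1(\underline{\theta})$ satisfies the requirement
(12), in the sense that:

1.  If $\nabla b_1(\underline{\theta}) \in \mathcal{D}_\gamma$, $\underline{\theta} \in S^n$, then $b_1(\underline{\theta})$ satisfies the requirement (12).

2.  If $\mathcal{D}$ is such that $\mathcal{D}_\gamma \subset \mathcal{D}$ (strictly proper subset) then $\mathcal{D}$ contains a vector $\nabla b_1(\underline{\theta})$,
$\underline{\theta} \in S^n$, which is the gradient of a function $b_1(\underline{\theta})$ that violates the requirement (12).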

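Proof of Proposition 1. Because $b_1(\underline{\theta})$ is differentiable and the sphere $S^n$ is a convex set, we
have from Rudin [7], Theorem 9.19,

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| \le M \|\underline{\theta} - \underline{\theta}^o\| , \qquad (14)$$

where $M$ is an upper bound on $\|\nabla b_1\|$ over $S^n$. Because the maximal distance between any two
vectors $\underline{\theta}, \underline{\theta}^o$ in the unit sphere $S^n$ is 2, the inequality (14) can be replaced by

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| \le 2M , \quad \forall \underline{\theta}, \underline{\theta}^o \in S^n . \qquad (15)$$

Now, if $\nabla b_1 \in \mathcal{D}_\gamma$, then $\|\nabla b_1\| \le \gamma/2$ so that, using $M = \gamma/2$ in (15),

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| \le \gamma , \quad \forall \underline{\theta}, \underline{\theta}^o \in S^n , \qquad (16)$$

which proves Assertion 1 of the proposition. However, if $\mathcal{D}$ contains $\mathcal{D}_\gamma$ as a proper subset, a
constant vector $\underline{d} \in \mathcal{D}$ exists such that $\|\underline{d}\| > \gamma/2$. Let the gradient vector $\nabla b_1$ be defined as the
constant $\underline{d}^T$. Because $\nabla b_1 = \underline{d}^T$ is independent of $\underline{\theta}$, we have $b_1(\underline{\theta}) = \underline{d}^T \underline{\theta} + C$ for some constant
$C$. Consider the two vectors

$$\underline{\theta} = \underline{\nu} + \frac{\underline{d}}{\|\underline{d}\|} , \qquad \underline{\theta}^o = \underline{\nu} - \frac{\underline{d}}{\|\underline{d}\|} . \qquad (17)$$

These vectors are on the boundary of the sphere $S^n$ because $\|\underline{\theta} - \underline{\theta}^o\| = 2$; furthermore,

$$|b_1(\underline{\theta}) - b_1(\underline{\theta}^o)| = |\underline{d}^T [\underline{\theta} - \underline{\theta}^o]| \qquad (18)$$

$$= 2\|\underline{d}\| > 2(\gamma/2) = \gamma , \qquad (19)$$

so that the requirement (12) is violated. This establishes Assertion 2 and completes the proof of
Proposition 1.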

Proposition 1 asserts that to satisfy the bias requirement (12), the constraint $\|\nabla b_1\| \le \delta$,
with $\delta = \gamma/2$, is the weakest possible gradient constraint which satisfies that requirement and is
independent of $\underline{\theta}$. It must be emphasized that before the proposition can be used the user ellipsoid
$\{\underline{\theta} : \|\mathrm{diag}(K_i)[\underline{\theta} - \underline{\nu}]\| \le 1\}$ has to be transformed to a sphere via the coordinate transformation
described in the paragraph following requirement (11).
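The conversion from a bias tolerance to a gradient-length constraint can be illustrated with a small numerical experiment. The sketch below is an illustration only: the tolerance `gamma`, the sphere center `nu`, the sampling helper `sample_ball`, and the choice of a linear bias function are assumptions made for the example, not quantities taken from the report. It checks both assertions of Proposition 1 for a linear bias $b_1(\underline{\theta}) = \underline{d}^T\underline{\theta} + C$ in the normalized (spherical) coordinates.

```python
import numpy as np

def gradient_bound(gamma):
    """Gradient-length constraint delta = gamma / 2 implied by Proposition 1."""
    return gamma / 2.0

rng = np.random.default_rng(1)
gamma = 0.2                                   # hypothetical bias tolerance in (12)
delta = gradient_bound(gamma)
nu = np.array([1.0, -2.0, 0.5])               # hypothetical center of the unit sphere

def sample_ball(center, size):
    """Draw points uniformly from the unit ball around `center`."""
    x = rng.standard_normal((size, center.size))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    r = rng.uniform(size=(size, 1)) ** (1.0 / center.size)
    return center + r * x

# Assertion 1: a constant gradient of length delta keeps the bias variation <= gamma.
d = rng.standard_normal(nu.size)
d *= delta / np.linalg.norm(d)
bias_values = sample_ball(nu, 2000) @ d       # the constant C cancels in differences
max_variation = bias_values.max() - bias_values.min()

# Assertion 2: a slightly longer gradient violates (12) at the antipodal points (17).
d_big = 1.1 * d
theta_plus = nu + d_big / np.linalg.norm(d_big)
theta_minus = nu - d_big / np.linalg.norm(d_big)
violation = abs(d_big @ (theta_plus - theta_minus))   # equals 2 * ||d_big||

print(delta, max_variation, violation)
```

Before applying `gradient_bound` to a user-specified ellipsoid, the parameters would first have to be rescaled by $\mathrm{diag}(K_i)$ as described above; the sketch starts from the already normalized sphere.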

It is important to note that Proposition 1 does not address the existence of estimators having
the bias function $b_1$ prescribed by Assertion 2 and violating the requirement (12). Specifically,
in proving Assertion 2 we produced a function $b_1$ and its gradient $\nabla b_1$, which violate constraints
on function variation and constraints on gradient magnitude, respectively. While this shows a
certain topological equivalence between these two types of constraints, there is no guarantee that
the function $b_1$ is the bias $E_{\underline{\theta}}[\hat{\theta}_1] - \theta_1$ of any physically realizable estimator $\hat{\theta}_1(X)$.




4.  INTERPRETATIONS  OF  THE  CR  BOUND 


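Define the vector of estimators $\hat{\underline{\theta}} = (\hat{\theta}_1, \ldots, \hat{\theta}_n)^T$ of parameters in the vector $\underline{\theta}$. Assume that
the PDF of the observations is "regular" ([6], Chapter 1, Section 7) and that $E_{\underline{\theta}}[\hat{\theta}_i^2]$ is bounded,
$i = 1, \ldots, n$; then the gradient of the mean $\nabla m_i(\underline{\theta}) = \nabla E_{\underline{\theta}}[\hat{\theta}_i]$ exists, $i = 1, \ldots, n$, and is continuous,
and the covariance matrix of $\hat{\underline{\theta}}$ satisfies the matrix CR lower bound $B_b(\underline{\theta})$ ([6], Chapter 1, Theorem
7.3):

$$\mathrm{cov}_{\underline{\theta}}(\hat{\underline{\theta}}) \ge B_b(\underline{\theta}) = [\nabla \underline{m}(\underline{\theta})]\, F^{-1}(\underline{\theta})\, [\nabla \underline{m}(\underline{\theta})]^T . \qquad (20)$$

In (20), $F = F(\underline{\theta})$ is the nonsingular $n \times n$ Fisher information matrix

$$F(\underline{\theta}) \stackrel{\mathrm{def}}{=} E_{\underline{\theta}} \left\{ [\nabla_{\underline{u}} \ln f(X; \underline{u})|_{\underline{u}=\underline{\theta}}]^T\, [\nabla_{\underline{u}} \ln f(X; \underline{u})|_{\underline{u}=\underline{\theta}}] \right\} , \qquad (21)$$

and $\nabla \underline{m} = \nabla \underline{m}(\underline{\theta})$ is an $n \times n$ matrix whose rows are the gradient vectors $\nabla m_i$, $i = 1, \ldots, n$. Under
additional assumptions ([6], Chapter 1, Lemma 8.1) the Fisher information matrix is equivalent to
the Hessian, or "curvature," matrix of the mean of $\ln f(X; \underline{u})$:

$$F(\underline{\theta}) = -E_{\underline{\theta}}\left[ \nabla_{\underline{u}}^T \nabla_{\underline{u}} \ln f(X; \underline{u})|_{\underline{u}=\underline{\theta}} \right] = -\nabla_{\underline{u}}^T \nabla_{\underline{u}}\, E_{\underline{\theta}}[\ln f(X; \underline{u})]\,|_{\underline{u}=\underline{\theta}} . \qquad (22)$$

If the vector estimator $\hat{\underline{\theta}}$ is locally unbiased, then $\nabla \underline{m}(\underline{\theta}) = I$ and the lower bound (20) becomes
the unbiased CR bound

$$\mathrm{cov}_{\underline{\theta}}(\hat{\underline{\theta}}) \ge F^{-1}(\underline{\theta}) . \qquad (23)$$

Comparison between the right-hand sides of the general CR bound (20) and the unbiased CR
bound (23) suggests defining the biased Fisher information matrix $F_b$

$$F_b(\underline{\theta}) = [\nabla \underline{m}(\underline{\theta})]^{-T} F(\underline{\theta}) [\nabla \underline{m}(\underline{\theta})]^{-1} , \qquad (24)$$

where it has been assumed that the matrix $\nabla \underline{m}(\underline{\theta})$ is invertible. With the definition (24) the general
CR bound (20) becomes

$$\mathrm{cov}_{\underline{\theta}}(\hat{\underline{\theta}}) \ge F_b^{-1}(\underline{\theta}) . \qquad (25)$$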




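Because $\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1)$ is the (1,1) element of $\mathrm{cov}_{\underline{\theta}}(\hat{\underline{\theta}})$, the matrix bound (20) gives the following
bound on the variance of $\hat{\theta}_1$:

$$\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1) \ge \underline{e}_1^T [\nabla \underline{m}(\underline{\theta})]\, F^{-1}(\underline{\theta})\, [\nabla \underline{m}(\underline{\theta})]^T \underline{e}_1 , \qquad (26)$$

where $\underline{e}_1$ is the unit (column) vector

$$\underline{e}_1 = [1, 0, \ldots, 0]^T . \qquad (27)$$

Note that, concerning the CR bound on $\hat{\theta}_1$, only the first row of $\nabla \underline{m}$ is important. The following,
denoted the general CR bound in the sequel, is equivalent to (26):

$$\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1) \ge B_b(\underline{\theta}) = [\nabla m_1(\underline{\theta})]\, F^{-1}(\underline{\theta})\, [\nabla m_1(\underline{\theta})]^T$$

$$= [\underline{e}_1 + \nabla b_1(\underline{\theta})^T]^T F^{-1}(\underline{\theta})\, [\underline{e}_1 + \nabla b_1(\underline{\theta})^T] , \qquad (28)$$

where, in the second equality of (28), the relation (9) has been used.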
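To make the bias dependence in (28) concrete, the following NumPy sketch evaluates the general CR bound for a hypothetical two-parameter problem. The Fisher matrix `F`, the gradient length `delta`, and the choice of gradient direction are all assumptions made for illustration; in particular, the direction used here is a simple descent direction, not the minimizing bias gradient derived later in the report.

```python
import numpy as np

def general_cr_bound(F, grad_b1):
    """Evaluate (28): [e1 + grad_b1^T]^T F^{-1} [e1 + grad_b1^T]."""
    F = np.asarray(F, dtype=float)
    e1 = np.zeros(F.shape[0])
    e1[0] = 1.0
    g = e1 + np.asarray(grad_b1, dtype=float)
    return float(g @ np.linalg.solve(F, g))

# Hypothetical Fisher matrix with strong coupling between theta_1 and a nuisance
# parameter theta_2.
F = np.array([[1.0, 0.95],
              [0.95, 1.0]])

unbiased = general_cr_bound(F, [0.0, 0.0])    # (1,1) element of F^{-1}, as in (30)

# A bias gradient of small length delta, pointed along -F^{-1} e1 (illustrative choice).
delta = 0.05
direction = -np.linalg.solve(F, np.array([1.0, 0.0]))
grad_b1 = delta * direction / np.linalg.norm(direction)
biased = general_cr_bound(F, grad_b1)

# The trivial estimator theta_1_hat = constant has gradient [-1, 0] and bound zero.
trivial = general_cr_bound(F, [-1.0, 0.0])

print(unbiased, biased, trivial)
```

In this example a bias gradient of length 0.05 already lowers the bound noticeably relative to the unbiased value, which is the kind of sensitivity the report sets out to quantify.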

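Observe that a lower bound on the mean-squared error (MSE) of $\hat{\theta}_1$, $\mathrm{MSE}_{\underline{\theta}}(\hat{\theta}_1) \stackrel{\mathrm{def}}{=} E_{\underline{\theta}}[\hat{\theta}_1 - \theta_1]^2$,
can be obtained from the variance lower bound (28) by using the relation $\mathrm{MSE}_{\underline{\theta}}(\hat{\theta}_1) = \mathrm{var}_{\underline{\theta}}(\hat{\theta}_1) +
b_1^2(\underline{\theta})$:

$$\mathrm{MSE}_{\underline{\theta}}(\hat{\theta}_1) \ge B_b(\underline{\theta}) + b_1^2(\underline{\theta}) . \qquad (29)$$

Any lower bound on the variance is also a lower bound on the MSE, as the second term on the
right of (29) is non-negative.

For locally unbiased estimators of $\theta_1$, the gradient vector $\nabla m_1(\underline{\theta})$ is the unit row vector $\underline{e}_1^T$
and the CR bound is the (1,1) element of the inverse Fisher matrix

$$\mathrm{var}_{\underline{\theta}}(\hat{\theta}_1) \ge \underline{e}_1^T F^{-1}(\underline{\theta})\, \underline{e}_1 , \qquad (30)$$

which will be called the unbiased form of the CR bound on $\hat{\theta}_1$.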

The  following  interpretations  are  helpful  in  understanding  the  influence  of  bias  on  the  CR 
bound. 
4.1  CR  Bounds  and  Unbiased  Estimation  of  the  Mean


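The general matrix CR bound (20) on the covariance of a biased estimator $\hat{\underline{\theta}}$ of $\underline{\theta}$ can be
equivalently interpreted as an unbiased CR bound, just as in (23), on the covariance of $\hat{\underline{\theta}}$ viewed
as a "differentially unbiased" estimator of its mean $\underline{m}(\underline{\theta})$.

Fix a point $\underline{\theta}^o \in \Theta$. An estimator $\hat{\underline{\theta}}$ with mean $\underline{m}(\underline{\theta})$ is defined to be differentially unbiased
at the point $\underline{\theta} = \underline{\theta}^o$ if $\nabla \underline{m}(\underline{\theta}^o) = I$, where $I$ is the $n \times n$ identity matrix. Note that, under the
assumption of differentiability of $\underline{m}(\underline{\theta})$, a locally unbiased estimator is necessarily differentially
unbiased. As $\underline{m}(\underline{\theta})$ is a differentiable function of $\underline{\theta}$,

$$\underline{m}(\underline{\theta}) - \underline{m}(\underline{\theta}^o) = \nabla \underline{m}(\underline{\theta}^o)(\underline{\theta} - \underline{\theta}^o) + o(\|\underline{\theta} - \underline{\theta}^o\|) \qquad (31)$$

so that $\underline{m}(\underline{\theta})$ is a locally linear transformation in the neighborhood of $\underline{\theta}^o$. Assuming the matrix
$\nabla \underline{m}(\underline{\theta}^o)$ to be invertible, this permits a local reparameterization of $\underline{\theta}$ by the values $\underline{\nu}$ taken on by
the linear approximation to the function $\underline{m}(\underline{\theta})$ over this neighborhood:

$$\underline{\nu} = \underline{\nu}(\underline{\theta}) \stackrel{\mathrm{def}}{=} \nabla \underline{m}(\underline{\theta}^o)(\underline{\theta} - \underline{\theta}^o) + \underline{m}(\underline{\theta}^o) , \qquad (32)$$

and

$$\underline{\theta} = \underline{\theta}(\underline{\nu}) = [\nabla \underline{m}(\underline{\theta}^o)]^{-1}(\underline{\nu} - \underline{m}(\underline{\theta}^o)) + \underline{\theta}^o . \qquad (33)$$

Using (33) and the chain rule of vector differentiation,

$$\nabla_{\underline{\nu}}\, \underline{m}(\underline{\theta}(\underline{\nu})) = \nabla_{\underline{\nu}}\, \underline{m}\left( [\nabla \underline{m}(\underline{\theta}^o)]^{-1}(\underline{\nu} - \underline{m}(\underline{\theta}^o)) + \underline{\theta}^o \right)$$

$$= \nabla_{\underline{u}}\, \underline{m}(\underline{u})\,|_{\underline{u} = [\nabla \underline{m}(\underline{\theta}^o)]^{-1}(\underline{\nu} - \underline{m}(\underline{\theta}^o)) + \underline{\theta}^o}\; [\nabla \underline{m}(\underline{\theta}^o)]^{-1}$$

$$= \nabla \underline{m}(\underline{\theta})\, [\nabla \underline{m}(\underline{\theta}^o)]^{-1} . \qquad (34)$$

When $\underline{\theta} = \underline{\theta}^o$, the last line of the above is the identity matrix so that $\hat{\underline{\theta}}$ is a differentially unbiased
estimator of the transformed parameter $\underline{\nu}$ at the point $\underline{\nu} = \underline{\nu}(\underline{\theta}^o) = \underline{m}(\underline{\theta}^o)$.

Because $\hat{\underline{\theta}}$ is a differentially unbiased estimator of $\underline{\nu} = \underline{m}(\underline{\theta}^o)$, the CR bound on the covariance
of $\hat{\underline{\theta}}$ at $\underline{\theta}^o$ is given by (23) with Fisher matrix $F'(\underline{\theta}^o)$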
F'iBn  =  f;solV.ln/(X;0(«/))|,=^(s.)]^[Vj,b/(X;0(i/))|,=^(5O)i  .  (35) 
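$$F'(\underline{\theta}^o) = E_{\underline{\theta}^o} \left\{ [\nabla_{\underline{\nu}} \ln f(X; \underline{\theta}(\underline{\nu}))|_{\underline{\nu}=\underline{\nu}(\underline{\theta}^o)}]^T\, [\nabla_{\underline{\nu}} \ln f(X; \underline{\theta}(\underline{\nu}))|_{\underline{\nu}=\underline{\nu}(\underline{\theta}^o)}] \right\} . \qquad (35)$$

Use of the relation (33) and application of the chain rule yields

$$\nabla_{\underline{\nu}} \ln f(X; \underline{\theta}(\underline{\nu}))|_{\underline{\nu}=\underline{\nu}(\underline{\theta}^o)} = \nabla_{\underline{\nu}} \ln f\left(X; [\nabla \underline{m}(\underline{\theta}^o)]^{-1}(\underline{\nu} - \underline{m}(\underline{\theta}^o)) + \underline{\theta}^o\right)|_{\underline{\nu}=\underline{m}(\underline{\theta}^o)}$$

$$= \nabla_{\underline{u}} \ln f(X; \underline{u})|_{\underline{u}=\underline{\theta}^o}\, [\nabla \underline{m}(\underline{\theta}^o)]^{-1} . \qquad (36)$$

Substitution of the above into (35) yields the form

$$F'(\underline{\theta}^o) = [\nabla \underline{m}(\underline{\theta}^o)]^{-T} F(\underline{\theta}^o) [\nabla \underline{m}(\underline{\theta}^o)]^{-1}$$

$$= F_b(\underline{\theta}^o) . \qquad (37)$$

Hence, the Fisher matrix $F'(\underline{\theta}^o)$ (37) for (differentially) unbiased estimation of $\underline{m}(\underline{\theta})$ is identical to
the biased Fisher matrix $F_b(\underline{\theta}^o)$ (24) for biased estimation of $\underline{\theta}$.

We can therefore conclude that there are two equivalent ways of interpreting the biased Fisher
information, or alternatively the CR bound (20): as a measure of the accuracy with which the mean $\underline{m}(\underline{\theta})$
of $\hat{\underline{\theta}}$ can be estimated without bias, and as a measure of the accuracy with which the parameter $\underline{\theta}$ can
be estimated with bias.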
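The algebraic equivalence between the general CR bound (20) and the inverse of the biased Fisher matrix (24) can be confirmed on a random instance. In the sketch below the Fisher matrix `F` and the Jacobian `J` standing in for $\nabla \underline{m}(\underline{\theta}^o)$ are randomly generated assumptions, chosen only so that both are invertible.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

# Hypothetical Fisher matrix (symmetric positive-definite).
G = rng.standard_normal((n, n))
F = G @ G.T + n * np.eye(n)

# Hypothetical Jacobian of the estimator mean: a small perturbation of the
# identity, i.e. an estimator that is nearly differentially unbiased.
J = np.eye(n) + 0.1 * rng.standard_normal((n, n))

# General CR bound (20): J F^{-1} J^T.
B_b = J @ np.linalg.inv(F) @ J.T

# Biased Fisher matrix (24): J^{-T} F J^{-1}.
J_inv = np.linalg.inv(J)
F_b = J_inv.T @ F @ J_inv

# (25): the two descriptions of the bound agree.
equivalent = np.allclose(np.linalg.inv(F_b), B_b)

# With J = I (a locally unbiased estimator) F_b reduces to F.
reduces_to_F = np.allclose(np.linalg.inv(np.eye(n)).T @ F @ np.eye(n), F)

print(equivalent, reduces_to_F)
```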

4.2  CR  Bounds  and  Sensitivity  of  the  Ambiguity  Function 

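Define the log-likelihood function $l$

$$l(\underline{\theta}) = \ln f(X; \underline{\theta}) , \qquad (38)$$

and the ambiguity function

$$\bar{l}(\underline{u}, \underline{\theta}) = E_{\underline{\theta}}[\ln f(X; \underline{u})] . \qquad (39)$$

For a fixed value $\underline{\theta} = \underline{\theta}^o$ of the parameter, the ambiguity function is simply the mean log-likelihood
function. Although the arguments $\underline{u}$ and $\underline{\theta}$ reside in the same space $\Theta$, it is useful to distinguish
between the search parameter $\underline{u}$ and the true parameter $\underline{\theta}$.

Two important properties of the ambiguity function are: 1. $\bar{l}(\underline{u}, \underline{\theta}^o)$ has a global maximum
over $\underline{u}$ at $\underline{u} = \underline{\theta}^o$ and consequently, if $\nabla_{\underline{u}} \bar{l}(\underline{u}, \underline{\theta}^o)|_{\underline{u}=\underline{\theta}^o}$ exists, $\bar{l}(\underline{u}, \underline{\theta}^o)$ has a stationary point at $\underline{u} = \underline{\theta}^o$,
i.e., $\nabla_{\underline{u}} \bar{l}(\underline{u}, \underline{\theta}^o)|_{\underline{u}=\underline{\theta}^o} = 0$; and 2. the sharpness of this maximum is related to the Fisher information
matrix $F(\underline{\theta}^o)$. Due to the latter property, the general CR bound (26) can be investigated through
a study of the smoothness of the ambiguity function.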

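To see 1. as defined in the preceding paragraph, observe that for any fixed $\underline{\theta}$ and arbitrary $\underline{u} \in \Theta$,

$$\bar{l}(\underline{\theta}, \underline{\theta}) - \bar{l}(\underline{u}, \underline{\theta}) = E_{\underline{\theta}}[\ln f(X; \underline{\theta}) - \ln f(X; \underline{u})]$$

$$= \int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{\theta}) [\ln f(x; \underline{\theta}) - \ln f(x; \underline{u})]\, dx , \qquad (40)$$

where $\mathrm{supp}\, f(\cdot; \underline{\theta}) = \{x : f(x; \underline{\theta}) > 0\}$. Using the elementary inequality $\ln y \le y - 1$, $y > 0$ ([8],
Theorem 150), we have the following:

$$\bar{l}(\underline{\theta}, \underline{\theta}) - \bar{l}(\underline{u}, \underline{\theta}) = -\int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{\theta}) \ln \frac{f(x; \underline{u})}{f(x; \underline{\theta})}\, dx$$

$$\ge -\int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{\theta}) \left[ \frac{f(x; \underline{u})}{f(x; \underline{\theta})} - 1 \right] dx$$

$$= \int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{\theta})\, dx - \int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{u})\, dx$$

$$= 1 - p , \qquad (41)$$

where $p = \int_{\mathrm{supp}\, f(\cdot;\underline{\theta})} f(x; \underline{u})\, dx \in [0,1]$. Hence, since $1 - p \ge 0$ in (41), $\bar{l}(\underline{\theta}, \underline{\theta}) \ge \bar{l}(\underline{u}, \underline{\theta})$, $\forall \underline{u}$; thus, $\bar{l}(\underline{u}, \underline{\theta})$ has a global
maximum at $\underline{u} = \underline{\theta}$.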

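For 2., observe that the incremental variation $\Delta \bar{l}$ in $\bar{l}(\underline{u}, \underline{\theta}^o)$, which is produced by an
incremental change in $\underline{u}$ from $\underline{u} = \underline{\theta}^o$ to $\underline{u} = \underline{\theta}^o + \Delta\underline{u}$, is given by the Taylor formula

$$\Delta \bar{l} = \bar{l}(\underline{\theta}^o + \Delta\underline{u}, \underline{\theta}^o) - \bar{l}(\underline{\theta}^o, \underline{\theta}^o)$$

$$= \nabla_{\underline{u}} \bar{l}(\underline{u}, \underline{\theta}^o)|_{\underline{u}=\underline{\theta}^o}\, \Delta\underline{u} + \frac{1}{2} \Delta\underline{u}^T \left[ \nabla_{\underline{u}}^T \nabla_{\underline{u}} \bar{l}(\underline{u}, \underline{\theta}^o)|_{\underline{u}=\underline{\theta}^o} \right] \Delta\underline{u} + \epsilon . \qquad (42)$$

In (42), $\epsilon$ is a remainder that falls off to zero as $o(\|\Delta\underline{u}\|^2)$. Use the fact that $\nabla_{\underline{u}} \bar{l}(\underline{u}, \underline{\theta}^o)|_{\underline{u}=\underline{\theta}^o} = 0$
and identify the Fisher matrix $F$ (22) in the quadratic form on the right-hand side of Equation
(42) to obtain

$$\Delta \bar{l} = -\frac{1}{2} \Delta\underline{u}^T F(\underline{\theta}^o) \Delta\underline{u} + \epsilon . \qquad (43)$$

From (43) it is clear that a small variation, $\Delta\underline{u}$, in the search parameter $\underline{u}$ produces a quadratic
variation in $\bar{l}(\underline{u}, \underline{\theta}^o)$, with $F$ playing the role of a gain or sensitivity matrix. Let the difference, $\Delta\underline{u}$,
between the search parameter and the fixed true parameter $\underline{\theta}^o$ vary over some differential region
defined with respect to the standard orthonormal basis for $\mathbb{R}^n$. If the Fisher information is a high-gain
matrix, e.g., $F$ has large eigenvalues, then the ambiguity function will have a large variation
in the corresponding eigenvector directions. In view of the dependence of the unbiased CR bound
(30) on $F$, this suggests that the sharpness of the peak of the ambiguity function in the standard
coordinates is directly related to the CR bound on unbiased estimators of $\underline{\theta}$.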
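Both properties of the ambiguity function can be checked on a textbook case. The sketch below assumes a single scalar Gaussian observation with unknown mean and variance, $\underline{\theta} = (\mu, v)$, which is an example chosen for illustration and not the covariance problem treated later in the report. For this model the ambiguity function and the Fisher matrix $\mathrm{diag}(1/v,\, 1/(2v^2))$ are available in closed form, so the curvature relation (22) can be compared with a finite-difference Hessian; the helper names are hypothetical.

```python
import numpy as np

def ambiguity(u, theta):
    """E_theta[ln f(X; u)] for X ~ N(mu, v), with u = (mu_u, v_u), theta = (mu, v)."""
    mu_u, v_u = u
    mu, v = theta
    return -0.5 * np.log(2.0 * np.pi * v_u) - (v + (mu - mu_u) ** 2) / (2.0 * v_u)

def fisher(theta):
    """Fisher information matrix of N(mu, v) with respect to (mu, v)."""
    _, v = theta
    return np.diag([1.0 / v, 1.0 / (2.0 * v ** 2)])

def numerical_hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

theta0 = np.array([0.5, 2.0])                 # hypothetical true parameter

# Property 1: global maximum of the ambiguity function at u = theta0.
rng = np.random.default_rng(3)
trial_u = np.column_stack([rng.uniform(-3.0, 3.0, 500), rng.uniform(0.1, 6.0, 500)])
peak = ambiguity(theta0, theta0)
is_global_max = all(ambiguity(u, theta0) <= peak + 1e-12 for u in trial_u)

# Property 2: curvature at the peak equals minus the Fisher matrix, as in (22).
H = numerical_hessian(lambda u: ambiguity(u, theta0), theta0)
curvature_ok = np.allclose(-H, fisher(theta0), atol=1e-5)

print(is_global_max, curvature_ok)
```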

Now,  let  Vm(0°)  be  the  gradient  matrix  (Jacobian)  of  the  mean  rn{0°)  of  a  biased  estimator 
and  assume  that  Vm{0°)  is  invertible.  Using  the  identity  [Vm(0‘’)][Vm(0°)]“^  =  /  in  (43),  we 


15 


obtain,  by  regrouping  terms. 


=  -^AM^[Vm(r)f  [VTTi(r)r^F(r)[Vzn(r)rMVm(r)]AM  +  € 

=  [Vrn(@°)A«f  {lVm(rr^F(0“)[Vrn(r)r^  }  [Vm(r)Ayl  +  e 

=  -^AijFb{r)Au  +  e  ,  (44) 

where $F_b$ is the biased Fisher information (24) at $\theta^\circ$ and

$$\Delta\nu = \nabla m(\theta^\circ)\,\Delta u \qquad (45)$$

is the differential parameter variation $\Delta\nu = \nu - m(\theta^\circ)$ in the new coordinates induced by the local
transformation (32), $\nu = \nabla m(\theta^\circ)\Delta u + m(\theta^\circ)$. The relation (44) is similar to the relation (43) in
that they both relate variations in the search parameter, $\Delta u$ and $\Delta\nu$, respectively, to variations in
the ambiguity function $I$ via a gain matrix, $F(\theta^\circ)$ and $F_b(\theta^\circ)$, respectively. If we fix a differential
region of variation for $\Delta u$ and $\Delta\nu$, then in (43) $F(\theta^\circ)$ is the gain associated with variation $\Delta u$ over
this differential region in the standard coordinates, while in (44) $F_b(\theta^\circ)$ is the gain associated
with variation $\Delta\nu$ over this differential region in the transformed coordinates (45). In light of the
dependence on $F_b$ of the general biased-estimation CR bound (25), this suggests that for biased
estimators the sharpness of the peak of the ambiguity function at $\Delta\nu = 0$ in the transformed
coordinates (which give a locally linear approximation to the mean function) is directly related
to the general CR bound, which, as noted previously, is the bound which applies to unbiased
estimation of the mean function of the estimator. Comparing this observation to that made after
(43), we see that in either the standard or transformed parameter coordinates the sharpness of
the peak of the ambiguity function is directly related to the CR bound that applies to unbiased
estimation in these coordinates. The relation between the general CR bound and the variation of
$I$ is explained in greater detail in the following paragraphs.
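The change of coordinates behind (44) and (45) can be checked numerically. The sketch below is illustrative only: the matrices `F` and `grad_m` are randomly generated stand-ins (assumptions, not values from the report), and `F_b` is formed as the bracketed gain matrix of (44). It confirms that the quadratic form is the same in both coordinate systems and that the inverse of `F_b` equals $\nabla m\,F^{-1}\,\nabla m^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Hypothetical positive-definite stand-in for the Fisher matrix F(theta°).
A = rng.standard_normal((n, n))
F = A @ A.T + n * np.eye(n)

# Hypothetical Jacobian of the estimator mean, a small perturbation of I.
grad_m = np.eye(n) + 0.1 * rng.standard_normal((n, n))
grad_m_inv = np.linalg.inv(grad_m)

# Gain matrix in the transformed coordinates, the bracketed term of (44).
F_b = grad_m_inv.T @ F @ grad_m_inv

du = 1e-3 * rng.standard_normal(n)   # variation in the standard coordinates
dnu = grad_m @ du                    # the same variation in the coordinates (45)

quad_standard = -0.5 * du @ F @ du
quad_transformed = -0.5 * dnu @ F_b @ dnu

# The inverse of F_b is grad_m F^{-1} grad_m^T.
cr_matrix = grad_m @ np.linalg.inv(F) @ grad_m.T
```

Setting `grad_m` to the identity recovers `F_b = F`, which corresponds to the unbiased case.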

Using (42) and (45), the variation of the ambiguity function as $\Delta\nu$ varies in the transformed
coordinates of (45) can be explicitly given in terms of the gradient matrix $\nabla m(\theta^\circ)$:

$$\Delta_\nu I = I(\theta^\circ + \Delta u) - I(\theta^\circ) = I(\theta^\circ + [\nabla m(\theta^\circ)]^{-1}\Delta\nu) - I(\theta^\circ) . \qquad (46)$$

In (46), $[\nabla m(\theta^\circ)]^{-1}\Delta\nu$ is the differential in the standard coordinates that is induced by the
differential $\Delta\nu$ in the transformed coordinates.

Figure 1. A spherical volume element $\{\Delta\nu : \|\Delta\nu\| = \Delta\}$ in the transformed coordinates.

For purposes of illustration, consider Figure 1, denoting a (spherical) volume element
$\{\Delta\nu : \|\Delta\nu\| = \Delta\}$ in the transformed coordinates $\Delta\nu = [\nabla m(\theta^\circ)]\Delta u$, and Figure 2, denoting the induced
(ellipsoidal) volume element $\{\Delta u : \|[\nabla m(\theta^\circ)]\Delta u\| = \Delta\}$ in the standard coordinates. Figure 2
corresponds to the case

$$\nabla m(\theta^\circ) = \begin{bmatrix} 1-a & b \\ 0 & 1 \end{bmatrix}$$

where  a  1  and  6  >  0.  In  Figure  2  the  angle  of  the  principal  axis  can  be  shown  (through 
considerable  algebra)  to  be 

V  (1  -  a)b  j 

and  the  positive  parameters  r  and  are  given  by 

(1  -  a)^  -t-  6^  +  I 
^  ~  2 
Figure 2. The induced differential volume element $\{\Delta u : \|[\nabla m(\theta^\circ)]\Delta u\| = \Delta\}$ in the
standard coordinates is an ellipsoid.


<r 


(iff 


(1 


Referring to Figures 1 and 2, let $\Delta\nu$ vary over the radius-$\Delta$ n-dimensional sphere $\{\nu : \|\nu\| \le \Delta\}$
(Figure 1). For unbiased estimation $\nabla m(\theta^\circ) = I$ and the variation in the argument
$\Delta u = [\nabla m(\theta^\circ)]^{-1}\Delta\nu = \Delta\nu$ of $\Delta_\nu I$ (46) is over the same n-dimensional sphere. For biased estimation
$\nabla m(\theta^\circ)$ is not an identity matrix and the variation in the argument $\Delta u$ is over the n-dimensional
ellipsoid $\{u : \|[\nabla m(\theta^\circ)]u\| \le \Delta\}$ (Figure 2). Under a set of bias constraints of the type (10) on each
of the rows of the bias gradient matrix, this ellipsoid can only be a small perturbation of a sphere. Nonetheless, if
the ambiguity function has an unstable characteristic locally in the standard coordinates, such as
a sharp ridge, then the differential variation, $\Delta_u I$, of $I$ over the ellipsoid in $\Delta u$ can be made much
larger than the differential variation of $I$ over the sphere in $\Delta\nu$ by judicious choice of $\nabla m(\theta^\circ)$ (see
Figure 3). The variation of $I$ over the sphere of radius $\Delta$ in $\Delta\nu$ can be used to bound from below
the general CR bound on variance (26) for estimators of $\theta_1$ by using the following fact.
Figure 3. The constant contours of a hypothetical ambiguity function $I(u, \theta^\circ)$, over $u \in \Theta$
for fixed $\theta = \theta^\circ$, and the superimposed induced differential volume elements of Figures 1
and 2. Because the ellipsoidal region includes a greater number of contour lines, the
maximum variation of $I$ over the ellipsoidal region is greater than the variation over the
spherical region.


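FACT:

$$B_{b_1} \ge \frac{\tfrac{1}{2}\Delta^2 + \epsilon}{\max_{\Delta\nu :\, \|\Delta\nu\|^2 \le \Delta^2} |\Delta_\nu I|} , \qquad (47)$$

where $B_{b_1}$ is the CR bound (28) and $\epsilon$ is $o(\|\Delta\nu\|^2) = o(\Delta^2)$.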


Hence, if the variation of $I$ is small, i.e., $I$ has a broad peak, in the transformed coordinates,
then (47) asserts that $B_{b_1}$ must be large, implying poor variance performance of estimators of $\theta_1$
with mean gradient matrix $\nabla m(\theta)$. More important, if a bias-induced coordinate transformation on
the search parameter $\Delta\nu = [\nabla m(\theta)]\Delta u$ can be found for which the local variation of the ambiguity
function over the ellipsoid $\{\Delta u : \|[\nabla m(\theta)]\Delta u\| \le \Delta\}$ is large, then a reduction in the CR bound
may be possible.


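Proof of Fact. Fix a parameter value $\theta$. Define $\Delta\nu$:

$$\Delta\nu = \Delta\, \frac{F_b^{-1/2} e_1}{\sqrt{e_1^T F_b^{-1} e_1}} , \qquad (48)$$

where $F_b^{-1/2}$ is a square-root factor of the (positive-definite) matrix $F_b^{-1}$ [see (24)].

It is shown that $\Delta\nu$ (48) is a vector contained in the radius-$\Delta$ sphere $\{\nu : \|\nu\| \le \Delta\}$, because
the norm squared of $\Delta\nu$ is

$$\|\Delta\nu\|^2 = \Delta^2\, \frac{e_1^T F_b^{-1/2} F_b^{-1/2} e_1}{e_1^T F_b^{-1} e_1} = \Delta^2 . \qquad (49)$$

Moreover, from (44) and (46),

$$\begin{aligned}
I(\theta + [\nabla m(\theta)]^{-1}\Delta\nu) - I(\theta) &= -\tfrac{1}{2}\,\Delta\nu^T F_b\, \Delta\nu + o(\Delta^2) \\
&= -\frac{\Delta^2}{2} \left[ \frac{F_b^{-1/2} e_1}{\sqrt{e_1^T F_b^{-1} e_1}} \right]^T F_b \left[ \frac{F_b^{-1/2} e_1}{\sqrt{e_1^T F_b^{-1} e_1}} \right] + o(\Delta^2) \\
&= -\frac{\Delta^2}{2}\, \frac{e_1^T e_1}{e_1^T F_b^{-1} e_1} + o(\Delta^2) \\
&= -\frac{\Delta^2}{2}\, \frac{1}{e_1^T F_b^{-1} e_1} + o(\Delta^2) .
\end{aligned} \qquad (50)$$

Now (50) gives the value for $\Delta_\nu I$ evaluated at a point $\Delta\nu$ (48) contained in the radius-$\Delta$ sphere.
Hence, the maximum magnitude of $\Delta_\nu I$ over the radius-$\Delta$ sphere must be at least as great as the
magnitude of the right-hand side of (50). This, and the elementary inequality $|x - y| \ge |x| - |y|$,
gives the bound

$$\max_{\Delta\nu :\, \|\Delta\nu\|^2 \le \Delta^2} |\Delta_\nu I| \ge \frac{\Delta^2}{2}\, \frac{1}{e_1^T F_b^{-1} e_1} - |o(\Delta^2)| . \qquad (51)$$

Multiplying this inequality through by $B_{b_1} / \max_{\Delta\nu :\, \|\Delta\nu\|^2 \le \Delta^2} |\Delta_\nu I|$ gives the statement of the fact (47).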
5. A NEW CR BOUND FOR ESSENTIALLY UNBIASED ESTIMATORS


Here we obtain a new lower bound which is applicable for essentially unbiased estimators,
i.e., those whose bias gradient is small. Assume that the bias $b_1$ is such that $\|\nabla b_1(\theta)\|^2 \le \delta^2$ for all
$\theta \in \Theta$. The starting point for the lower bound is the obvious inequality [see (28)]

$$\mathrm{var}_\theta(\hat\theta_1) \ge B_{b_1}(\theta) \ge \min_{b_1 :\, \|\nabla b_1\| \le \delta} B_{b_1}(\theta) \ge \min_{d :\, \|d\| \le \delta} [e_1 + d]^T F^{-1} [e_1 + d] . \qquad (52)$$

This gives the lower bound valid for estimators $\hat\theta_1$ which satisfy the bias-gradient constraint (10):

$$\mathrm{var}_\theta(\hat\theta_1) \ge B(\theta,\delta) , \qquad (53)$$

where

$$B(\theta,\delta) = \min_{d :\, \|d\| \le \delta} [e_1 + d]^T F^{-1} [e_1 + d] . \qquad (54)$$

Note that by definition $B(\theta,0)$ is just the unbiased CR bound. The bound (53) is independent of
the particular bias, $b_1$, of the estimator as long as the bias constraint is satisfied. The normalized
difference

$$\Delta B(\theta,\delta) = \frac{B(\theta,0) - B(\theta,\delta)}{B(\theta,0)} \qquad (55)$$

is the potential improvement achievable over an absolutely unbiased estimator, i.e., $\Delta B(\theta,\delta)$ measures
the bias sensitivity of the unbiased CR bound. The sensitivity is a real number between 0
and 1, and increased sensitivity corresponds to a larger difference.

The new bound is specified by the solution of the minimization problem (54). We find the
solution in the following theorem and corollary.

Theorem 1 Let $\hat\theta_1$ be an essentially unbiased estimator of $\theta_1$ with bias $b_1(\theta)$ which satisfies the
constraint (10) $\|\nabla_\theta b_1\|^2 \le \delta^2 < 1$. The lower bound $B(\theta,\delta)$ (53) is equal to

$$B(\theta,\delta) = B(\theta,0) - \lambda\delta^2 - e_1^T [I + \lambda F]^{-1} F^{-1} e_1 , \qquad (56)$$

where $\lambda$ is given by the unique non-negative solution of the following equation involving the monotone
decreasing convex function $g(\lambda) \in [0,1]$:

$$g(\lambda) \stackrel{\mathrm{def}}{=} e_1^T [I + \lambda F]^{-2} e_1 = \delta^2 , \quad \lambda \ge 0 . \qquad (57)$$

Corollary 1 An alternative form for the bound (56) is

$$B(\theta,\delta) = [e_1 + d_{\min}]^T F^{-1} [e_1 + d_{\min}] , \qquad (58)$$

where $d_{\min}$ is the vector which minimizes the quadratic form $[e_1 + d]^T F^{-1} [e_1 + d]$ in (54) over the
constraint set $\{d : \|d\| \le \delta\}$:

$$d_{\min} = -[I + \lambda F]^{-1} e_1 . \qquad (59)$$

In (59), $\lambda$ is the solution to (57).

Observe that the vector $d_{\min}$ (59) of Corollary 1 is related to the function $g(\lambda)$ (57) of
Theorem 1 by the identity

$$d_{\min}^T d_{\min} = e_1^T [I + \lambda F]^{-2} e_1 = g(\lambda) . \qquad (60)$$

Because $g(\lambda) = \delta^2$ in Theorem 1, $d_{\min}$ is on the boundary of the bias-gradient constraint set.
Proof of Theorem 1. The objective is to show that the right-hand side of (54) is equal to the
right-hand side of (56):

$$\min_{d :\, \|d\| \le \delta} Q(d) = B(\theta,0) - \lambda\delta^2 - e_1^T [I + \lambda F]^{-1} F^{-1} e_1 , \qquad (61)$$

where the general CR bound (28) has been denoted by the quadratic form $Q(\nabla b_1)$ to make evident
the quadratic dependence on the bias gradient $\nabla b_1$:

$$Q(d) = [e_1 + d]^T F^{-1} [e_1 + d] . \qquad (62)$$

We take a geometric point of view which is easily formalized by using standard Lagrange
multiplier theory. Specifically, $Q(d)$ is a convex-upwards paraboloid centered at coordinates
$-e_1 = -[1, 0, \ldots, 0]^T$, and $Q(-e_1) = 0$. The problem is to find the vector $d = d_{\min}$ within the radius-$\delta$
ball, $d^T d \le \delta^2$, for which $Q(d)$ is a minimum. Observe that by the assumption $\delta < 1$ the absolute
minimum of $Q$ is not attained within the radius-$\delta$ ball. Therefore, as there are no local minima,
the minimizing vector $d_{\min}$ must be on the boundary of the ball. By inspection of the constant
contours of $Q(d)$ (Figure 4), the minimum is attained when the contour of $Q(d)$ is tangent to the
boundary of the radius-$\delta$ ball, i.e., the gradients are collinear and of opposite sign:

$$\nabla_d Q(d) = -\lambda \nabla_d\, d^T d \qquad (63)$$

for some $\lambda > 0$. Using the rules of vector differentiation for quadratic forms, (4) and (5), (63) is
equivalent to

$$F^{-1} [e_1 + d] = -\lambda d . \qquad (64)$$

Figure 4. Plot of the constant contours of $Q(d)$ and the domain of the constraint $d^T d \le \delta^2$.
The minimum of $Q$ is achieved at the point indicated by the vector $d_{\min}$, which is
normal to the tangent plane between $Q(d)$ and $d^T d = \delta^2$.
Solving (64) for $d = d_{\min}$ gives

$$d = d_{\min} = -[I + \lambda F]^{-1} e_1 . \qquad (65)$$

Note that the matrix inverse $[I + \lambda F]^{-1}$ exists, since $\lambda$ is non-negative. The scaling constant $\lambda$ has
to be chosen so that $d_{\min}$ is on the boundary of the radius-$\delta$ ball $\{d : \|d\|^2 \le \delta^2\}$, from which we obtain
equation (57) for $\lambda$:

$$g(\lambda) \stackrel{\mathrm{def}}{=} d_{\min}^T d_{\min} = e_1^T [I + \lambda F]^{-2} e_1 = \delta^2 . \qquad (66)$$

Observe that $g(0) = 1$, $g(\infty) = 0$, and because $F$ is positive-definite, $g(\lambda) > 0$ for $\lambda \ge 0$. Application
of the differentiation identity (6) to $g(\lambda)$ gives $g'(\lambda) = -2 e_1^T F [I + \lambda F]^{-3} e_1$. Due to positive-definiteness
of $F$, $|g'(\lambda)| < \infty$ so that $g$ is continuous at all points $\lambda \ge 0$. Furthermore, since
$[I + \lambda F]^{-3}$ is symmetric positive-definite with identical eigenvectors as $F$, $F[I + \lambda F]^{-3}$ is positive-definite
(see Section 2.2) for $\lambda \ge 0$. Hence, $g$ is monotone decreasing over $\lambda \ge 0$ with values
$g(\lambda) \in [0,1]$. In a similar manner, the second derivative $g''$ can be shown to be positive, which
establishes that $g$ is a convex (upwards) function.

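It remains to show that the minimizing solution $d_{\min}$ gives the bound (56). This is established
by substitution of $d_{\min}$ (65) into $Q(d)$ (62):

$$\begin{aligned}
B(\theta,\delta) &= Q(d_{\min}) \\
&= [e_1 + d_{\min}]^T F^{-1} [e_1 + d_{\min}] \\
&= e_1^T F^{-1} e_1 - e_1^T [I + \lambda F]^{-1} F^{-1} e_1 \\
&\quad - e_1^T \left[ F^{-1} [I + \lambda F]^{-1} - [I + \lambda F]^{-1} F^{-1} [I + \lambda F]^{-1} \right] e_1 .
\end{aligned} \qquad (67)$$

In (67) we have made use of the property of symmetry of the Fisher matrix. The first and second
terms on the right-hand side of (67) are simply the unbiased CR bound (30), $B(\theta,0)$, and the final
term on the right-hand side of (56), respectively. Using the constraint (66), the third term in (67)
is seen to be equal to

$$\begin{aligned}
e_1^T &\left[ F^{-1} [I + \lambda F]^{-1} - [I + \lambda F]^{-1} F^{-1} [I + \lambda F]^{-1} \right] e_1 \\
&= e_1^T [I + \lambda F]^{-1} \left[ [I + \lambda F] F^{-1} - F^{-1} \right] [I + \lambda F]^{-1} e_1 \\
&= e_1^T [I + \lambda F]^{-1} [\lambda I] [I + \lambda F]^{-1} e_1 \\
&= \lambda\, e_1^T [I + \lambda F]^{-2} e_1 = \lambda\delta^2 .
\end{aligned} \qquad (68)$$

This establishes the theorem.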
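As a numerical illustration of Theorem 1 and Corollary 1, the following sketch solves $g(\lambda) = \delta^2$ by bisection (which is valid because $g$ is monotone decreasing with $g(0) = 1$) and then evaluates the bound in both forms, (56) and (58). The helper names `g`, `solve_lambda`, and `bound`, as well as the example matrix `F`, are assumptions made for this illustration and do not come from the report.

```python
import numpy as np

def g(lam, F):
    """g(lambda) = e_1^T [I + lambda F]^{-2} e_1, as in (57)."""
    n = F.shape[0]
    col = np.linalg.solve(np.eye(n) + lam * F, np.eye(n)[:, 0])
    return float(col @ col)

def solve_lambda(F, delta):
    """Bisection for g(lambda) = delta^2, assuming 0 < delta < 1."""
    lo, hi = 0.0, 1.0
    while g(hi, F) > delta**2:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid, F) > delta**2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bound(F, delta):
    """Return the bound via (56), via (58), and the minimizer (65)."""
    n = F.shape[0]
    e1 = np.eye(n)[:, 0]
    F_inv = np.linalg.inv(F)
    lam = solve_lambda(F, delta)
    M = np.linalg.inv(np.eye(n) + lam * F)
    b56 = e1 @ F_inv @ e1 - lam * delta**2 - e1 @ M @ F_inv @ e1
    d_min = -M @ e1
    b58 = (e1 + d_min) @ F_inv @ (e1 + d_min)
    return b56, b58, d_min

# Hypothetical 3 x 3 positive-definite Fisher matrix.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
delta = 0.1
b56, b58, d_min = bound(F, delta)
unbiased = np.linalg.inv(F)[0, 0]   # B(theta, 0)
```

With these inputs the two forms should agree to rounding error, `d_min` should have norm `delta`, and the bound should fall below the unbiased value `unbiased`.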
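As stated, Theorem 1 gives a bound $B(\theta,\delta)$ for which the (1,1) element of the inverses
of the $n \times n$ matrices $F[I + \lambda F]$ and $[I + \lambda F]^2$ must be computed. These calculations can be
implemented by sequential partitioning [9]. A more explicit version of $B(\theta,\delta)$ will be of interest for
the approximations of Section 6.

Let the Fisher information matrix $F$ (21) have the partition

$$F = \begin{bmatrix} a & c^T \\ c & F_s \end{bmatrix} , \qquad (69)$$

where $a$ is a scalar, $c$ is an $(n-1) \times 1$ vector, and $F_s$ is an $(n-1) \times (n-1)$ submatrix. Define the
inverse matrices

$$F^{-1} \stackrel{\mathrm{def}}{=} \begin{bmatrix} \alpha & \beta^T \\ \beta & \Gamma \end{bmatrix} , \qquad (70)$$

$$[I + \lambda F]^{-1} \stackrel{\mathrm{def}}{=} \begin{bmatrix} \alpha_\lambda & \beta_\lambda^T \\ \beta_\lambda & \Gamma_\lambda \end{bmatrix} . \qquad (71)$$

Using the partitioned-matrix inverse identity (2), $\alpha$, $\alpha_\lambda$, $\Gamma$, $\beta$, and $\beta_\lambda$ have the following expressions
in terms of the elements of $F$ (69):

$$\begin{aligned}
\alpha &= (a - c^T F_s^{-1} c)^{-1} \\
\beta &= -\alpha F_s^{-1} c \\
\Gamma &= F_s^{-1} + \alpha F_s^{-1} c\, c^T F_s^{-1}
\end{aligned} \qquad (72)$$

$$\begin{aligned}
\alpha_\lambda &= (1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^{-1} \\
\beta_\lambda &= -\lambda \alpha_\lambda [I + \lambda F_s]^{-1} c .
\end{aligned} \qquad (73)$$

The quantity $\Gamma_\lambda$ will not be explicitly needed and is omitted.

In (69), $c$ represents the information coupling between estimates of $\theta_1$ and estimates of the
other parameters $\theta_2, \ldots, \theta_n$. This can be seen from the fact that for unbiased estimation the CR
bound on $\theta_1$ (30) is $(F^{-1})_{11} = \alpha = (a - c^T F_s^{-1} c)^{-1}$, which is identical to $a^{-1}$ in the case $c = 0$.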
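Theorem 2 In terms of the partitioned Fisher information matrix (69), the bound $B(\theta,\delta)$ and the
associated constraint equation (57) of Theorem 1 have the expressions

$$B(\theta,\delta) = B(\theta,0) - \lambda\delta^2 - \alpha_\lambda \alpha - \beta_\lambda^T \beta , \qquad (74)$$

and $\lambda$ is the unique positive solution of

$$\delta^2 = \frac{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}{(1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^2} . \qquad (75)$$

Furthermore, the minimizing vector $d_{\min}$ (65) of Corollary 1 is given by

$$d_{\min} = -[\alpha_\lambda, \beta_\lambda^T]^T = \alpha_\lambda [-1, \lambda c^T [I + \lambda F_s]^{-1}]^T . \qquad (76)$$

In (74), (75), and (76) $\alpha$, $\alpha_\lambda$, $\beta$, and $\beta_\lambda$ are the quantities given in (72) and (73).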

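Proof of Theorem. Substitution of the inverses $F^{-1}$ (70) and $[I + \lambda F]^{-1}$ (71) into the expression
for $B(\theta,\delta)$ (56) of Theorem 1 gives

$$\begin{aligned}
B(\theta,\delta) &= B(\theta,0) - \lambda\delta^2 - e_1^T \begin{bmatrix} \alpha_\lambda & \beta_\lambda^T \\ \beta_\lambda & \Gamma_\lambda \end{bmatrix} \begin{bmatrix} \alpha & \beta^T \\ \beta & \Gamma \end{bmatrix} e_1 \\
&= B(\theta,0) - \lambda\delta^2 - [\alpha_\lambda, \beta_\lambda^T] [\alpha, \beta^T]^T \\
&= B(\theta,0) - \lambda\delta^2 - \alpha_\lambda \alpha - \beta_\lambda^T \beta ,
\end{aligned} \qquad (77)$$

which is expression (74). Similarly, substitution of (71) into the expression for $d_{\min}$ (65) of
Corollary 1 gives

$$\begin{aligned}
d_{\min} &= -[I + \lambda F]^{-1} e_1 = -\begin{bmatrix} \alpha_\lambda & \beta_\lambda^T \\ \beta_\lambda & \Gamma_\lambda \end{bmatrix} e_1 = [-\alpha_\lambda, -\beta_\lambda^T]^T \\
&= \alpha_\lambda [-1, \lambda c^T [I + \lambda F_s]^{-1}]^T ,
\end{aligned} \qquad (78)$$

where, in the last line, the identity for $\beta_\lambda$ (73) has been used.

Squaring the matrix $[I + \lambda F]^{-1}$ (71), we find that the constraint equation (57) of Theorem 1
has the following expression:

$$g(\lambda) = e_1^T [I + \lambda F]^{-2} e_1 = \alpha_\lambda^2 + \beta_\lambda^T \beta_\lambda = \delta^2 . \qquad (79)$$

Application of the identities (73) to the previous equation gives the equivalent expression for $g$,

$$\begin{aligned}
g(\lambda) &= \alpha_\lambda^2 + \lambda^2 \alpha_\lambda^2 c^T [I + \lambda F_s]^{-2} c \\
&= \alpha_\lambda^2 \left[ 1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c \right] \\
&= \frac{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}{(1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^2} ,
\end{aligned} \qquad (80)$$

which is equivalent to the expression in (75). This completes the proof of Theorem 2.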
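The partitioned expressions (72), (73), and (80) can be cross-checked against direct matrix inversion. In the sketch below the matrix `F` and the value of `lam` are arbitrary illustrative choices (assumptions); any positive-definite `F` and any `lam >= 0` should behave the same way.

```python
import numpy as np

# Hypothetical Fisher matrix and multiplier, chosen only for illustration.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
lam = 3.0

# Partition (69): scalar a, coupling vector c, submatrix F_s.
a, c, Fs = F[0, 0], F[1:, 0], F[1:, 1:]
I_s = np.eye(Fs.shape[0])

# Quantities (72).
Fs_inv_c = np.linalg.solve(Fs, c)
alpha = 1.0 / (a - c @ Fs_inv_c)
beta = -alpha * Fs_inv_c

# Quantities (73); S holds [I + lam F_s]^{-1} c.
S = np.linalg.solve(I_s + lam * Fs, c)
alpha_lam = 1.0 / (1.0 + lam * a - lam**2 * (c @ S))
beta_lam = -lam * alpha_lam * S

# Constraint function in the partitioned form (80).
g_partitioned = alpha_lam**2 * (1.0 + lam**2 * (S @ S))

# Direct computation for comparison.
F_inv = np.linalg.inv(F)
M = np.linalg.inv(np.eye(F.shape[0]) + lam * F)
g_direct = M[:, 0] @ M[:, 0]
```

Here `F_inv[0, 0]` and `F_inv[1:, 0]` should reproduce `alpha` and `beta`, `M[0, 0]` and `M[1:, 0]` should reproduce `alpha_lam` and `beta_lam`, and the two values of $g$ should coincide.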

The bound in Theorem 2 does not have an analytic form in general because the solution $\lambda$ to
(75) is not given explicitly. Numerical polynomial root-finding techniques can be used to solve (75),
or equivalently (57), for $\lambda$. In particular, as the Fisher matrix $F$ is positive-definite and symmetric
it has the representation $F = Q \Phi Q^T$, where $Q$ is an orthogonal matrix with columns $q_i$ and $\Phi$ is
a diagonal matrix with diagonal elements $\phi_i > 0$, $i = 1, \ldots, n$. Hence, we have

$$\begin{aligned}
e_1^T [I + \lambda F]^{-2} e_1 &= e_1^T [I + \lambda Q \Phi Q^T]^{-2} e_1 \\
&= e_1^T \left( Q [I + \lambda\Phi] Q^T \right)^{-2} e_1 \\
&= e_1^T Q [I + \lambda\Phi]^{-2} Q^T e_1 .
\end{aligned} \qquad (81)$$

Therefore, inserting the expression into the left-hand side of (57) and multiplying both sides by
$\prod_{i=1}^n [1 + \lambda\phi_i]^2$, we obtain a polynomial equation for $\lambda$:

$$\sum_{i=1}^n (e_1^T q_i)^2\, \frac{1}{(1 + \lambda\phi_i)^2} \prod_{j=1}^n (1 + \lambda\phi_j)^2 = \delta^2 \prod_{j=1}^n (1 + \lambda\phi_j)^2 . \qquad (82)$$

Subtraction of the right-hand side of (82) from the left-hand side gives a polynomial in $\lambda$ of degree
$2n$ which must be set to zero.
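One possible way to carry out the polynomial root-finding step is sketched below with NumPy's `Polynomial` class. The function name `lambda_from_polynomial` and the example matrix are assumptions for illustration; the polynomial is assembled from the eigen-decomposition of $F$ following (81) and (82), and the positive real root is selected.

```python
import numpy as np
from numpy.polynomial import Polynomial

def lambda_from_polynomial(F, delta):
    """Solve (57) through the degree-2n polynomial obtained from (82)."""
    phi, Q = np.linalg.eigh(F)          # F = Q diag(phi) Q^T
    w = Q[0, :] ** 2                    # weights (e_1^T q_i)^2
    factors = [Polynomial([1.0, p]) ** 2 for p in phi]   # (1 + lam*phi_i)^2

    full = Polynomial([1.0])
    for f in factors:
        full = full * f                 # prod_j (1 + lam*phi_j)^2

    lhs = Polynomial([0.0])
    for i, wi in enumerate(w):
        term = Polynomial([wi])
        for j, f in enumerate(factors):
            if j != i:                  # the i-th factor cancels 1/(1+lam*phi_i)^2
                term = term * f
        lhs = lhs + term

    roots = (lhs - full * delta**2).roots()
    real_pos = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]
    return max(real_pos)

# Hypothetical example values.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
delta = 0.1
lam = lambda_from_polynomial(F, delta)

# Check the root against the definition of g in (57).
col = np.linalg.solve(np.eye(3) + lam * F, np.eye(3)[:, 0])
g_at_root = col @ col
```

Because $g$ is monotone on $\lambda \ge 0$, only one positive real root is expected; for large $n$ a bracketing method applied directly to $g$ may be numerically preferable to forming the polynomial.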

For the special case of zero information coupling, $c = 0$, an explicit expression for $\lambda$ can be
found and the bound $B(\theta,\delta)$ of Theorem 2 has an explicit form. From the definitions of $\alpha$, $\alpha_\lambda$, $\beta$
and $\beta_\lambda$, (72) and (73), $c = 0$ implies that $\alpha = a^{-1} = B(\theta,0)$, $\alpha_\lambda = (1 + \lambda a)^{-1}$, and $\beta = \beta_\lambda = 0$.
Furthermore, from (75)

$$\delta^2 = g(\lambda) = \frac{1}{(1 + \lambda a)^2} = \alpha_\lambda^2 \qquad (83)$$

and, consequently,

$$\lambda = a^{-1} (\delta^{-1} - 1) . \qquad (84)$$

Hence, using (83) in (76)

$$d_{\min} = [-\alpha_\lambda, 0^T]^T = [-\delta, 0^T]^T = -\delta e_1 , \qquad (85)$$

and, from (58), the bound is

$$\begin{aligned}
B(\theta,\delta) &= [e_1 + d_{\min}]^T F^{-1} [e_1 + d_{\min}] \\
&= [e_1 - \delta e_1]^T F^{-1} [e_1 - \delta e_1] = (1 - \delta)^2 e_1^T F^{-1} e_1 \\
&= (1 - \delta)^2 B(\theta,0) = (1 - \delta)^2 / a .
\end{aligned} \qquad (86)$$

Observe that zero information coupling implies two important facts: small bias gradients have
very little effect on the general CR bound, as the difference (55) $\Delta B = 1 - (1 - \delta)^2 \approx 0$; and
coupling of the bias, $b_1$, of $\hat\theta_1$ to the other parameters $\theta_2, \ldots, \theta_n$ is not likely to significantly reduce
the CR bound, as the CR bound's minimizing vector $d_{\min}$ (85) makes the bias of $\hat\theta_1$ independent
of the other parameters.

In the following sections we will assume that $c \ne 0$.
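The closed-form expressions (84) through (86) for the zero-coupling case are easy to confirm numerically. The values of `a`, `Fs`, and `delta` below are hypothetical; the only structural requirement is that the first row and column of `F` vanish off the diagonal.

```python
import numpy as np

# Hypothetical values with zero information coupling (c = 0).
a, delta = 2.0, 0.1
Fs = np.array([[1.0, 0.2],
               [0.2, 1.5]])
F = np.zeros((3, 3))
F[0, 0] = a
F[1:, 1:] = Fs

e1 = np.eye(3)[:, 0]
lam = (1.0 / delta - 1.0) / a                 # closed form (84)

M = np.linalg.inv(np.eye(3) + lam * F)
g_val = M[:, 0] @ M[:, 0]                     # should equal delta^2, cf. (83)
d_min = -M @ e1                               # should equal -delta * e_1, cf. (85)
B = (e1 + d_min) @ np.linalg.inv(F) @ (e1 + d_min)
B_closed_form = (1.0 - delta) ** 2 / a        # closed form (86)
sensitivity = 1.0 - (1.0 - delta) ** 2        # normalized difference (55)
```

For `delta = 0.1` the normalized difference is 0.19, so the relative reduction of the bound stays of the order of the bias-gradient length itself.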
6. A BIAS-SENSITIVITY INDEX AND A SMALL-$\delta$ APPROXIMATION


In the previous section the form of a CR-type bound $B(\theta,\delta)$ was given in terms of an undetermined
multiplier $\lambda$ given by the solution of $g(\lambda) = \delta^2$ (75). For the case of zero information
coupling, $g(\lambda)$ has a simple form, giving an explicit solution for $\lambda$ and $B(\theta,\delta)$ (86). Otherwise, an
exact analytic form for $B(\theta,\delta)$ is difficult to obtain and the solution $\lambda$ to (75) might, for example,
be calculated using numerical polynomial root-finding techniques applied to (82). In this section
an $o(\delta)$ approximation is developed for $B(\theta,\delta)$ which converges to the true solution as $\delta$ approaches
zero. Associated with the approximation is an approximate minimizing bias-gradient vector $d^*_{\min}$
which achieves the $o(\delta)$ approximation to $B(\theta,\delta)$.

While the approximations offer little computational advantage relative to the implementation
of the exact computation indicated in Theorems 1 and 2, the approximate analytical forms
provide some insight into the important factors underlying the bias sensitivity of the CR bound.
In particular, the bias sensitivity of the CR bound is characterized by the slope of $B(\theta,\delta)$ at
$\delta = 0$. A large slope implies that a small amount of bias can substantially decrease the nominally
unbiased CR bound, corresponding to high sensitivity. In the sequel, this slope will be related to
a bias-sensitivity index.

For convenience, formula (75) is repeated here:

$$\delta^2 = \frac{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}{(1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^2} . \qquad (87)$$

The idea behind the derivation of the $o(\delta)$ approximation follows. Theorem 1 established that $g(\lambda)$
is convex and monotone decreasing over $\lambda \ge 0$, $g(0) = 1$ and $g(\infty) = 0$. Therefore, for a sufficiently
small value of $\delta$, the solution $\lambda$ to $g(\lambda) = \delta^2$ is sufficiently large so that simultaneously

$$\lambda c^T [I + \lambda F_s]^{-1} c \approx c^T F_s^{-1} c \quad \text{and} \quad \lambda^2 c^T [I + \lambda F_s]^{-2} c \approx c^T F_s^{-2} c . \qquad (88)$$

If (88) holds, the constraint to be satisfied (87) becomes the simpler equation

$$\delta^2 = \frac{1 + c^T F_s^{-2} c}{(1 + \lambda [a - c^T F_s^{-1} c])^2} , \qquad (89)$$

from which the solution $\lambda$ is simply computed by taking the square root of both sides of (89) and
solving for $\lambda$. This solution can be plugged back into (65) and (56) to obtain approximations to
$d_{\min}$ and $B(\theta,\delta)$.

The following proposition puts precise asymptotic conditions on the solution to the constraint
equation (87) to guarantee the validity of the approximation (89).
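Proposition 2 Let $\lambda^*$ be given by the non-negative solution of (89)

$$\lambda^* = \frac{1}{a - c^T F_s^{-1} c} \left( \frac{\sqrt{1 + c^T F_s^{-2} c}}{\delta} - 1 \right) \qquad (90)$$

and assume $c \ne 0$. Corresponding to $\lambda^*$, define the vector

$$d^* = -[I + \lambda^* F]^{-1} e_1 . \qquad (91)$$

With these definitions, $\lambda^*$ approximates the actual solution $\lambda$ of $g(\lambda) = \delta^2$, $\lambda \ge 0$ (75), in the
sense that

1. $0 < g(\lambda^*) \le \delta^2$, so that $d^*$ does not violate the constraint (10) on the bias gradient.
(Recall, from (60), that $g(\lambda^*) = \|d^*\|^2$.)

2. $\lambda\delta \to \lambda^*\delta = \alpha\sqrt{1 + c^T F_s^{-2} c} + O(\delta)$ as $\delta \to 0$.

As a consequence of 2., $\lambda\delta = O(1)$ and also $1/\lambda = O(\delta)$.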
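Proposition 2 can be illustrated by comparing the exact multiplier with the closed-form approximation obtained by solving (89) for $\lambda$. In the sketch below, `solve_lambda` is a hypothetical bisection helper and `F` is an arbitrary example matrix (both assumptions); the comparison is run for two values of $\delta$.

```python
import numpy as np

def g(lam, F):
    """g(lambda) = e_1^T [I + lambda F]^{-2} e_1, as in (57)."""
    n = F.shape[0]
    col = np.linalg.solve(np.eye(n) + lam * F, np.eye(n)[:, 0])
    return float(col @ col)

def solve_lambda(F, delta):
    """Exact multiplier: bisection on the monotone function g."""
    lo, hi = 0.0, 1.0
    while g(hi, F) > delta**2:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid, F) > delta**2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_star(F, delta):
    """Approximate multiplier: the non-negative solution of (89)."""
    a, c, Fs = F[0, 0], F[1:, 0], F[1:, 1:]
    z = np.linalg.solve(Fs, c)                    # F_s^{-1} c
    return (np.sqrt(1.0 + z @ z) / delta - 1.0) / (a - c @ z)

# Hypothetical Fisher matrix with nonzero coupling c.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])

results = {}
for delta in (0.1, 0.01):
    lam, lam_s = solve_lambda(F, delta), lambda_star(F, delta)
    results[delta] = {
        "g_at_lambda_star": g(lam_s, F),          # statement 1: at most delta^2
        "gap": abs(lam * delta - lam_s * delta),  # statement 2: shrinks with delta
    }
```

The entries of `results` show whether $g(\lambda^*)$ stays within the constraint and how quickly $\lambda\delta$ and $\lambda^*\delta$ approach each other as $\delta$ decreases.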


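To prove Proposition 2 we will use the following:

Lemma 1 Let $c \ne 0$ and define the positive quantity $q$:

$$q \stackrel{\mathrm{def}}{=} \min \left\{ \frac{c^T F_s^{-1} c}{c^T F_s^{-2} c} ,\; \frac{c^T F_s^{-2} c}{c^T F_s^{-3} c} \right\} . \qquad (92)$$

Then $\lambda q \to \infty$ as $\delta \to 0$. Furthermore, the following bounds are valid for all $\lambda > 0$:

$$c^T F_s^{-1} c \left( 1 - \frac{1}{\lambda q} \right) \le \lambda c^T [I + \lambda F_s]^{-1} c \le c^T F_s^{-1} c ; \qquad (93)$$

$$c^T F_s^{-2} c \left( 1 - \frac{2}{\lambda q} \right) \le \lambda^2 c^T [I + \lambda F_s]^{-2} c \le c^T F_s^{-2} c . \qquad (94)$$

A weaker set of bounds is obtained by replacement of $q$ in (93) and (94) by the minimum eigenvalue
$\lambda_{\min}^{F_s}$ of the matrix $F_s$. Furthermore, from (93) and (94), $\lambda c^T [I + \lambda F_s]^{-1} c = c^T F_s^{-1} c + O(1/\lambda)$ and
$\lambda^2 c^T [I + \lambda F_s]^{-2} c = c^T F_s^{-2} c + O(1/\lambda)$.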
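The two-sided bounds of Lemma 1 can be spot-checked on a grid of multiplier values. The sketch below assumes that $q$ is the smaller of the two ratios $c^T F_s^{-1} c / c^T F_s^{-2} c$ and $c^T F_s^{-2} c / c^T F_s^{-3} c$, which is how the proof uses it; the matrix `Fs` and vector `c` are hypothetical example values.

```python
import numpy as np

# Hypothetical submatrix F_s and coupling vector c.
Fs = np.array([[1.0, 0.2],
               [0.2, 1.5]])
c = np.array([0.9, 0.3])
I_s = np.eye(2)

Fs_inv = np.linalg.inv(Fs)
k1 = c @ Fs_inv @ c                        # c^T F_s^{-1} c
k2 = c @ Fs_inv @ Fs_inv @ c               # c^T F_s^{-2} c
k3 = c @ Fs_inv @ Fs_inv @ Fs_inv @ c      # c^T F_s^{-3} c
q = min(k1 / k2, k2 / k3)                  # assumed form of (92)

def bounds_hold(lam):
    """Check the bounds (93) and (94) for one value of lambda."""
    S = np.linalg.solve(I_s + lam * Fs, c)     # [I + lam F_s]^{-1} c
    v93 = lam * (c @ S)
    v94 = lam**2 * (S @ S)
    ok93 = k1 * (1.0 - 1.0 / (lam * q)) <= v93 <= k1
    ok94 = k2 * (1.0 - 2.0 / (lam * q)) <= v94 <= k2
    return bool(ok93 and ok94)

checks = [bounds_hold(lam) for lam in np.logspace(-2, 4, 25)]
```

For small `lam` the lower bounds are negative and therefore trivially satisfied; the bounds become informative once `lam * q` exceeds 2.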
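Proof of Lemma 1. Recall from Theorem 1 that $g$ is a continuous, monotonically decreasing
function with $\lim_{\lambda \to \infty} g(\lambda) = 0$. This implies that the inverse function $g^{-1}$ is continuous monotonically
decreasing with $\lim_{\delta \to 0} g^{-1}(\delta^2) = \infty$; therefore, if $q$ is fixed and nonzero, for sufficiently small
$\delta$ the quantity $\lambda q$ can be made arbitrarily large. Hence, $\lambda q \to \infty$ as $\delta \to 0$ as claimed.

The right-hand inequalities in (93) and (94) follow from the inequality for a positive-definite
matrix $A$ and arbitrary vector $t$:

$$t^T [I + \lambda A]^{-L} t \le \lambda^{-L}\, t^T A^{-L} t \qquad (95)$$

for any integer $L > 0$. This can be proven via an eigen-decomposition of $A$. The left-hand inequalities
in (93) and (94) are established by application of the Sherman-Morrison-Woodbury identity
[Equation (3)]:

$$[I + \lambda F_s]^{-1} = \frac{1}{\lambda} F_s^{-1} - \frac{1}{\lambda} [I + \lambda F_s]^{-1} F_s^{-1} . \qquad (96)$$

Application of (95) and (96) to $c^T [I + \lambda F_s]^{-1} c$ gives directly

$$\begin{aligned}
\lambda c^T [I + \lambda F_s]^{-1} c &= c^T F_s^{-1} c - c^T [I + \lambda F_s]^{-1} F_s^{-1} c \\
&= c^T F_s^{-1} c - c^T F_s^{-1/2} [I + \lambda F_s]^{-1} F_s^{-1/2} c \\
&\ge c^T F_s^{-1} c - \frac{1}{\lambda}\, c^T F_s^{-2} c \\
&= c^T F_s^{-1} c \left( 1 - \frac{1}{\lambda}\, \frac{c^T F_s^{-2} c}{c^T F_s^{-1} c} \right) .
\end{aligned} \qquad (97)$$

Application of (95) and (96) to $c^T [I + \lambda F_s]^{-2} c$ yields

$$\begin{aligned}
\lambda^2 c^T [I + \lambda F_s]^{-2} c &= \lambda^2 c^T [I + \lambda F_s]^{-1} [I + \lambda F_s]^{-1} c \\
&= c^T \left[ F_s^{-1} - [I + \lambda F_s]^{-1} F_s^{-1} \right] \left[ F_s^{-1} - [I + \lambda F_s]^{-1} F_s^{-1} \right] c \\
&= c^T F_s^{-2} c - 2 c^T [I + \lambda F_s]^{-1} F_s^{-2} c + c^T [I + \lambda F_s]^{-2} F_s^{-2} c \\
&\ge c^T F_s^{-2} c - 2 c^T [I + \lambda F_s]^{-1} F_s^{-2} c \\
&= c^T F_s^{-2} c - 2 c^T F_s^{-1} [I + \lambda F_s]^{-1} F_s^{-1} c \\
&\ge c^T F_s^{-2} c \left( 1 - \frac{2}{\lambda}\, \frac{c^T F_s^{-3} c}{c^T F_s^{-2} c} \right) .
\end{aligned} \qquad (98)$$

In obtaining the inequalities (97) and (98) we have used the fact that $[I + \lambda F_s]^{-1}$, $F_s^{-1}$ and $F_s^{-2}$
are positive-definite matrices with identical eigen-decompositions; hence (see Section 2.2), these
matrices commute and $[I + \lambda F_s]^{-2} F_s^{-2}$ is positive-definite. By the definition (92),
$1/q \ge c^T F_s^{-2} c / c^T F_s^{-1} c$ and $1/q \ge c^T F_s^{-3} c / c^T F_s^{-2} c$, so that the right-hand sides of the
inequalities (97) and (98) are underbounded by the left-hand sides of the inequalities (93) and (94)
in the statement of the lemma.

Recall the variational inequality ([3], Theorem 4.2.2) for any compatible vector $z$ and symmetric
matrix $A$:

$$z^T A z \ge \lambda_{\min}^{A}\, z^T z , \qquad (99)$$

where $\lambda_{\min}^{A}$ is the minimum eigenvalue of $A$. Apply (99) to the lower inequality in (97) along with
the definition $z = F_s^{-1} c$:

$$\lambda c^T [I + \lambda F_s]^{-1} c \ge c^T F_s^{-1} c \left( 1 - \frac{z^T z}{\lambda\, z^T F_s z} \right) \ge c^T F_s^{-1} c \left( 1 - \frac{1}{\lambda \lambda_{\min}^{F_s}} \right) . \qquad (100)$$

Likewise, the definition $z = F_s^{-3/2} c$ and (98) give

$$\lambda^2 c^T [I + \lambda F_s]^{-2} c \ge c^T F_s^{-2} c \left( 1 - \frac{2\, c^T F_s^{-3} c}{\lambda\, c^T F_s^{-2} c} \right) \ge c^T F_s^{-2} c \left( 1 - \frac{2}{\lambda \lambda_{\min}^{F_s}} \right) . \qquad (101)$$

This establishes the lemma.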

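Proof of Proposition 2. Application of the inequalities in the lemma to the expression for
$g(\lambda)$ (87):

$$g(\lambda) = \frac{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}{(1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^2} \qquad (102)$$

evaluated at $\lambda = \lambda^*$ gives

$$\frac{1 + c^T F_s^{-2} c\, (1 - 2(\lambda^* q)^{-1})}{\left[ 1 + \lambda^* \left( a - c^T F_s^{-1} c\, (1 - (\lambda^* q)^{-1}) \right) \right]^2} \le g(\lambda^*) \le \frac{1 + c^T F_s^{-2} c}{\left[ 1 + \lambda^* (a - c^T F_s^{-1} c) \right]^2} = \delta^2 . \qquad (103)$$

The right-hand inequality in (103) establishes statement 1. in the statement of the proposition:
$g(\lambda^*) \le \delta^2$. Because the lower bound in (103) approaches the upper bound in (103) as $\lambda^* q \to \infty$,
$g(\lambda^*)$ is forced to $\delta^2$ as $\lambda^* q \to \infty$, or equivalently, by the definition of $\lambda^*$ (90), as $\delta \to 0$. To
establish statement 2., recall that $g(\lambda) = \delta^2$ (75) and consider the following:

$$\begin{aligned}
\lambda\delta &= \lambda\, g^{1/2}(\lambda) \\
&= \lambda\, \frac{\sqrt{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}}{1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c} \\
&= \frac{\sqrt{1 + \lambda^2 c^T [I + \lambda F_s]^{-2} c}}{\frac{1}{\lambda} + a - \lambda c^T [I + \lambda F_s]^{-1} c} .
\end{aligned} \qquad (104)$$

Now, by Lemma 1, (104) becomes

$$\lambda\delta = \frac{\sqrt{1 + c^T F_s^{-2} c + O(1/\lambda)}}{a - c^T F_s^{-1} c + O(1/\lambda)} = \frac{\sqrt{1 + c^T F_s^{-2} c}}{a - c^T F_s^{-1} c} + O(1/\lambda) . \qquad (105)$$

Because $\lambda \to \infty$ as $\delta \to 0$, and identifying $\alpha$ (72), we obtain the limit

$$\lim_{\delta \to 0} \lambda\delta = \alpha \sqrt{1 + c^T F_s^{-2} c} . \qquad (106)$$

That $\lim_{\delta \to 0} \lambda^* \delta$ is identically the right-hand side of (106) follows directly from the form of $\lambda^*$ (90)
and the identity (72).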

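We next derive an explicit expression for the slope of $B(\theta,\delta)$ (56) at $\delta = 0$. This expression
will be used to develop an $o(\delta)$ approximation to $B(\theta,\delta)$.

Theorem 3 The derivatives of the bound $B(\theta,\delta)$ and of the normalized difference between the
unbiased CR bound and this bound, $\Delta B(\theta,\delta) = [B(\theta,0) - B(\theta,\delta)] / B(\theta,0)$, are given by

$$\left. \frac{dB(\theta,\delta)}{d\delta} \right|_{\delta=0} = -2 B(\theta,0) \sqrt{1 + \eta} , \quad \text{and} \quad \left. \frac{d\Delta B(\theta,\delta)}{d\delta} \right|_{\delta=0} = 2\sqrt{1 + \eta} , \qquad (107)$$

where $\eta$ is the bias-sensitivity index defined by

$$\eta \stackrel{\mathrm{def}}{=} \|F_s^{-1} c\|^2 = c^T F_s^{-2} c . \qquad (108)$$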

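Proof of Theorem 3. The existence of the derivatives will follow from the existence of
$\lim_{\delta \to 0} [B(\theta,0) - B(\theta,\delta)] / \delta$, which is derived in the following expressions.

For convenience we repeat (74):

$$B(\theta,\delta) = B(\theta,0) - \lambda\delta^2 - \alpha_\lambda \alpha - \beta_\lambda^T \beta , \qquad (109)$$

where

$$\begin{aligned}
\alpha &= (a - c^T F_s^{-1} c)^{-1} = B(\theta,0) \\
\beta &= -\alpha F_s^{-1} c
\end{aligned} \qquad (110)$$

and

$$\begin{aligned}
\alpha_\lambda &= (1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c)^{-1} \\
\beta_\lambda &= -\lambda \alpha_\lambda [I + \lambda F_s]^{-1} c .
\end{aligned} \qquad (111)$$

The use of the identities for $\alpha$, $\alpha_\lambda$, $\beta$, and $\beta_\lambda$, (110) and (111), gives the following equation
for the difference $B(\theta,0) - B(\theta,\delta)$:

$$\begin{aligned}
B(\theta,0) - B(\theta,\delta) &= \lambda\delta^2 + \alpha_\lambda \alpha + \beta_\lambda^T \beta \\
&= \lambda\delta^2 + \alpha_\lambda \alpha + \lambda \alpha_\lambda c^T [I + \lambda F_s]^{-1} F_s^{-1} c\, \alpha \\
&= \delta \left\{ \lambda\delta + \alpha \left( 1 + \lambda c^T [I + \lambda F_s]^{-1} F_s^{-1} c \right) \frac{\alpha_\lambda}{\delta} \right\} ,
\end{aligned} \qquad (112)$$

so that

$$\frac{B(\theta,0) - B(\theta,\delta)}{\delta} = \lambda\delta + \alpha \left( 1 + \lambda c^T [I + \lambda F_s]^{-1} F_s^{-1} c \right) \frac{\alpha_\lambda}{\delta} . \qquad (113)$$

Next we develop the following facts:

• From (93) of Lemma 1 and the forms of $\alpha_\lambda$ (111) and $\alpha$ (110),

$$\begin{aligned}
\frac{\alpha_\lambda}{\delta} &= \frac{1}{\delta}\, \frac{1}{1 + \lambda a - \lambda^2 c^T [I + \lambda F_s]^{-1} c} \\
&= \frac{1}{\delta}\, \frac{1}{1 + \lambda (a - c^T F_s^{-1} c) + \lambda\, O(1/\lambda)} \\
&= \frac{1}{\delta}\, \frac{1}{1 + \lambda/\alpha + O(1)} \\
&= \frac{\alpha}{\lambda\delta} + O(\delta) .
\end{aligned} \qquad (114)$$

• A completely analogous argument as used in proving (93) of Lemma 1 establishes

$$\lambda c^T [I + \lambda F_s]^{-1} F_s^{-1} c = c^T F_s^{-2} c + O(1/\lambda) = c^T F_s^{-2} c + O(\delta) . \qquad (115)$$

Substitute relations (114) and (115) into the right-hand side of (113) to obtain

$$\frac{B(\theta,0) - B(\theta,\delta)}{\delta} = \lambda\delta + \alpha \left( 1 + c^T F_s^{-2} c \right) \frac{\alpha}{\lambda\delta} + O(\delta) . \qquad (116)$$

Recalling the result of Proposition 2,

$$\lambda\delta \to \alpha \sqrt{1 + c^T F_s^{-2} c} , \quad \text{as } \delta \to 0 , \qquad (117)$$

take the limit of (116) as $\delta \to 0$ to obtain

$$\begin{aligned}
-\left. \frac{dB(\theta,\delta)}{d\delta} \right|_{\delta=0} &= \lim_{\delta \to 0} \frac{B(\theta,0) - B(\theta,\delta)}{\delta} \\
&= \alpha \sqrt{1 + c^T F_s^{-2} c} + \frac{\alpha^2 (1 + c^T F_s^{-2} c)}{\alpha \sqrt{1 + c^T F_s^{-2} c}} \\
&= 2\alpha \sqrt{1 + c^T F_s^{-2} c} .
\end{aligned} \qquad (118)$$

Because $\alpha = B(\theta,0)$, the first identity of (107) is established. The additional observation that

$$\frac{d\Delta B(\theta,\delta)}{d\delta} = \frac{d}{d\delta} \left[ \frac{B(\theta,0) - B(\theta,\delta)}{B(\theta,0)} \right] = -\frac{dB(\theta,\delta)}{d\delta}\, \frac{1}{B(\theta,0)} \qquad (119)$$

establishes Theorem 3.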
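The slope formula (107) can be compared with a one-sided finite difference of the exact bound. The sketch below is a rough check under stated assumptions: `solve_lambda` and `bound` are hypothetical helpers implementing (57) and (58) by bisection, `F` is an arbitrary example matrix, and the step `delta = 1e-4` is simply chosen to be small.

```python
import numpy as np

def g(lam, F):
    """g(lambda) = e_1^T [I + lambda F]^{-2} e_1, as in (57)."""
    n = F.shape[0]
    col = np.linalg.solve(np.eye(n) + lam * F, np.eye(n)[:, 0])
    return float(col @ col)

def solve_lambda(F, delta):
    """Bisection for g(lambda) = delta^2."""
    lo, hi = 0.0, 1.0
    while g(hi, F) > delta**2:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid, F) > delta**2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bound(F, delta):
    """Exact bound B(theta, delta) in the form (58)."""
    n = F.shape[0]
    e1 = np.eye(n)[:, 0]
    lam = solve_lambda(F, delta)
    v = e1 - np.linalg.solve(np.eye(n) + lam * F, e1)   # e_1 + d_min
    return float(v @ np.linalg.solve(F, v))

# Hypothetical Fisher matrix with nonzero coupling.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
c, Fs = F[1:, 0], F[1:, 1:]

eta = float(np.sum(np.linalg.solve(Fs, c) ** 2))   # sensitivity index (108)
B0 = np.linalg.inv(F)[0, 0]                        # unbiased CR bound B(theta, 0)

delta = 1e-4
slope_fd = (bound(F, delta) - B0) / delta          # finite-difference slope
slope_thm = -2.0 * B0 * np.sqrt(1.0 + eta)         # slope predicted by (107)
```

A larger `eta`, i.e., stronger coupling relative to the information in `Fs`, steepens the predicted slope, which is the sense in which $\eta$ indexes bias sensitivity.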
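Theorem 3 can be used to approximate the form of the bound $B(\theta,\delta)$ up to $o(\delta)$ accuracy; in
particular, $B(\theta,\delta) \approx B(\theta,0) + \delta\, \left. \frac{dB(\theta,\delta)}{d\delta} \right|_{\delta=0}$ for sufficiently small $\delta$. We have the following:

Theorem 4 The lower bound $B(\theta,\delta)$ (56) on the variance of $\hat\theta_1$ has the representation

$$B(\theta,\delta) = B^*(\theta,\delta) + o(\delta) , \qquad (120)$$

where

$$B^*(\theta,\delta) = B(\theta,0) \left( 1 - 2\delta \sqrt{1 + c^T F_s^{-2} c} \right) . \qquad (121)$$

Define the vector $d^*_{\min}$:

$$d^*_{\min} = \frac{\delta}{\sqrt{1 + c^T F_s^{-2} c}}\, [-1, c^T F_s^{-1}]^T . \qquad (122)$$

Then $d^*_{\min}$ is an $o(\delta)$ approximation to the minimizing vector $d_{\min}$ of $B(\theta,\delta)$ in the sense that

$$[e_1 + d^*_{\min}]^T F^{-1} [e_1 + d^*_{\min}] = B(\theta,\delta) + o(\delta) . \qquad (123)$$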

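Proof of Theorem 4. The relation (120) is just a consequence of the form of the derivative (107) of
$B(\theta,\delta)$ at $\delta = 0$, given in Theorem 3, and the Taylor expansion

$$\begin{aligned}
B(\theta,\delta) &= B(\theta,0) + \delta \left. \frac{dB(\theta,\delta)}{d\delta} \right|_{\delta=0} + o(\delta) \\
&= B(\theta,0) - 2 B(\theta,0)\, \delta \sqrt{1 + c^T F_s^{-2} c} + o(\delta) .
\end{aligned} \qquad (124)$$

To establish the second part of the theorem, (123) must be shown to hold. For notational
convenience, define

$$\alpha_\infty \stackrel{\mathrm{def}}{=} \frac{\delta}{\sqrt{1 + c^T F_s^{-2} c}} , \qquad (125)$$

so that $d^*_{\min} = \alpha_\infty [-1, c^T F_s^{-1}]^T$. The substitution of $d^*_{\min}$ into the left-hand side of (123) and
identification of the inverse Fisher matrix (70) gives

$$\begin{aligned}
[e_1 + d^*_{\min}]^T F^{-1} [e_1 + d^*_{\min}] &= [1 - \alpha_\infty,\; \alpha_\infty c^T F_s^{-1}] \begin{bmatrix} \alpha & \beta^T \\ \beta & \Gamma \end{bmatrix} [1 - \alpha_\infty,\; \alpha_\infty c^T F_s^{-1}]^T \\
&= (1 - \alpha_\infty)^2 \alpha + 2\alpha_\infty (1 - \alpha_\infty)\, c^T F_s^{-1} \beta + \alpha_\infty^2\, c^T F_s^{-1} \Gamma F_s^{-1} c .
\end{aligned} \qquad (126)$$

Now recall the definitions (72) for $\alpha$, $\beta$ and $\Gamma$:

$$\begin{aligned}
\alpha &= (a - c^T F_s^{-1} c)^{-1} = B(\theta,0) \\
\beta &= -\alpha F_s^{-1} c \\
\Gamma &= F_s^{-1} + \alpha F_s^{-1} c\, c^T F_s^{-1} .
\end{aligned} \qquad (127)$$

Substitution of (127) into (126) gives

$$\begin{aligned}
[e_1 + d^*_{\min}]^T F^{-1} [e_1 + d^*_{\min}] &= \alpha (1 - \alpha_\infty)^2 - 2\alpha\alpha_\infty (1 - \alpha_\infty)\, c^T F_s^{-2} c \\
&\quad + \alpha_\infty^2\, c^T F_s^{-1} \left( F_s^{-1} + \alpha F_s^{-1} c\, c^T F_s^{-1} \right) F_s^{-1} c \\
&= \alpha \left[ (1 - \alpha_\infty)^2 - 2\alpha_\infty (1 - \alpha_\infty)\, c^T F_s^{-2} c + \alpha_\infty^2 (c^T F_s^{-2} c)^2 \right] + \alpha_\infty^2\, c^T F_s^{-3} c \\
&= \alpha \left[ 1 - \alpha_\infty - \alpha_\infty c^T F_s^{-2} c \right]^2 + \alpha_\infty^2\, c^T F_s^{-3} c \\
&= \alpha \left[ 1 - \alpha_\infty (1 + c^T F_s^{-2} c) \right]^2 + \alpha_\infty^2\, c^T F_s^{-3} c .
\end{aligned} \qquad (128)$$

Finally, recalling the definition (125) of $\alpha_\infty$,

$$\begin{aligned}
[e_1 + d^*_{\min}]^T F^{-1} [e_1 + d^*_{\min}] &= \alpha \left[ 1 - \delta \sqrt{1 + c^T F_s^{-2} c} \right]^2 + \delta^2\, \frac{c^T F_s^{-3} c}{1 + c^T F_s^{-2} c} \\
&= B(\theta,0) \left[ 1 - \delta \sqrt{1 + c^T F_s^{-2} c} \right]^2 + o(\delta) \\
&= B^*(\theta,\delta) + o(\delta) \\
&= B(\theta,\delta) + o(\delta) ,
\end{aligned} \qquad (129)$$

where in (129) we have used the facts $\alpha = B(\theta,0)$ (70), and $B(\theta,\delta) = B^*(\theta,\delta) + o(\delta)$ (120). This
finishes the proof.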
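The small-$\delta$ approximation of Theorem 4 can be evaluated directly from the partition of $F$, with no root finding. The sketch below uses a hypothetical example matrix and takes $d^*_{\min} = \delta\,[-1, c^T F_s^{-1}]^T / \sqrt{1 + c^T F_s^{-2} c}$, the form implied by the proof; it then measures how far the quadratic form at $d^*_{\min}$ lies from $B^*(\theta,\delta)$ relative to $\delta$.

```python
import numpy as np

# Hypothetical Fisher matrix and a small bias-gradient length.
F = np.array([[2.0, 0.9, 0.3],
              [0.9, 1.0, 0.2],
              [0.3, 0.2, 1.5]])
delta = 1e-3

a, c, Fs = F[0, 0], F[1:, 0], F[1:, 1:]
Fs_inv_c = np.linalg.solve(Fs, c)          # F_s^{-1} c
eta = Fs_inv_c @ Fs_inv_c                  # sensitivity index (108)
B0 = 1.0 / (a - c @ Fs_inv_c)              # unbiased CR bound, alpha in (72)

# First-order approximation of the bound, as in (121).
B_star = B0 * (1.0 - 2.0 * delta * np.sqrt(1.0 + eta))

# Approximate minimizing bias-gradient vector, assumed form of (122).
d_star = delta / np.sqrt(1.0 + eta) * np.concatenate(([-1.0], Fs_inv_c))

e1 = np.eye(3)[:, 0]
v = e1 + d_star
Q_star = v @ np.linalg.solve(F, v)         # quadratic form (62) at d_star

# An o(delta) discrepancy means this ratio should be small for small delta.
gap_over_delta = (Q_star - B_star) / delta
```

By construction `d_star` has norm `delta`, so it is a feasible bias gradient, and the ratio `gap_over_delta` is expected to shrink proportionally to `delta`.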

The following corollary to Theorem 4 will be needed for the sequel.

Corollary 2 If for a given point $\theta^\circ$ the Fisher matrix $F(\theta)$ is invertible at $\theta = \theta^\circ$ and its elements
are uniformly continuous over an open neighborhood $U$ of $\theta^\circ$, then an open neighborhood $V \subset U$ of
$\theta^\circ$ exists such that in (120) $\frac{1}{\delta} o(\delta)$ converges to zero uniformly over $\theta \in V$ as $\delta \to 0$.


Proof of Corollary 2. Denote by $\|A\|$ the norm of the square matrix $A$ ([3], Section 5.6) and define
$E(\theta) = F(\theta) - F(\theta^\circ)$. Because the elements of $F(\theta)$ are uniformly continuous over a neighborhood
$U$ of $\theta^\circ$, $\|E(\theta)\|$ converges uniformly to zero as $\theta \to \theta^\circ$, and since $F(\theta^\circ)$ is invertible, an open
neighborhood $W \subset U$ of $\theta^\circ$ exists such that $F(\theta) = F(\theta^\circ) + E(\theta)$ is invertible over $\theta \in W$. Therefore,
using the inequality $\|F^{-1}(\theta) - F^{-1}(\theta^\circ)\| \le \|F^{-1}(\theta)\| \cdot \|F^{-1}(\theta^\circ)\| \cdot \|E(\theta)\|$ ([3], p. 341, Exercise 13),
$F^{-1}(\theta)$ converges to $F^{-1}(\theta^\circ)$ uniformly over the neighborhood $W$ of $\theta^\circ$. Hence, $F(\theta)$ and $F^{-1}(\theta)$
are both uniformly continuous over the neighborhood $W$ of $\theta^\circ$. Now recall the definition (57) of
the function $g(\lambda) = g_\theta(\lambda)$: $g_\theta(\lambda) = e_1^T [I + \lambda F(\theta)]^{-2} e_1$. Since $I + \lambda F(\theta)$ is invertible and uniformly
continuous over the neighborhood $U$ of $\theta^\circ$, a similar argument establishes that $g_\theta(\lambda)$ is uniformly
continuous in $\theta$ over a neighborhood $W'$ of $\theta^\circ$, where without loss of generality we can take $W' = W$.
It can also be verified that for all $\delta > 0$ the function $B^*(\theta,\delta)$ (121) is uniformly continuous in $\theta$
over the neighborhood $W$ of $\theta^\circ$. Define $f(\xi, \lambda) = g_\theta(\lambda) - \delta^2$, where $\xi = [\delta, \theta^T]^T$. From the uniform
continuity of $g_\theta(\lambda)$ it follows that $f(\xi, \lambda)$ is uniformly continuous in $\xi \in \mathbb{R}^+ \times W$. Furthermore,
applying the identity (6) to $f(\xi, \lambda)$, the derivative $\nabla_\lambda f(\xi, \lambda) = -2 e_1^T F(\theta) [I + \lambda F(\theta)]^{-3} e_1$ is nonzero
and continuous in $\lambda$ for $\lambda \ge 0$. Define $\xi^\circ = [\delta, \theta^{\circ T}]^T$. By Theorem 1, for any $\delta > 0$ a unique point
$\lambda^\circ$ exists such that $f(\xi^\circ, \lambda^\circ) = 0$. We can now apply the implicit function theorem ([10], Chapter
4, Theorem 15.1) to assert that an open neighborhood $V \subset W$ exists such that the solution
$\lambda = \lambda(\xi)$ to the equation $g_\theta(\lambda) = \delta^2$ is uniformly continuous in $\theta$ over $\theta \in V$. Therefore, in
view of the functional form (56), the bound $B(\theta,\delta)$ is uniformly continuous in $\theta$ over $\theta \in V$. The
remainder term $o(\delta)$ in (120) is thus equal to the difference $B(\theta,\delta) - B^*(\theta,\delta)$ of two uniformly
continuous functions; therefore $\frac{1}{\delta} o(\delta)$ converges to zero uniformly over $\theta \in V$. This establishes the
corollary.
7. DISCUSSION


In this section issues related to the bound of Theorems 1 and 2 will be briefly discussed. Some general issues are of importance: Is the bound of Theorem 1 achievable with any practical estimator? What does the bound (56) imply about the inherent performance limitations of unbiased estimators in the presence of nuisance parameters?


7.1  Achievability  of  New  Bound 

If for all $\theta$ in a region $D \subset \Theta$ an unbiased estimator achieves the (unbiased) CR bound $B(\theta,0)$ on MSE (recall that MSE equals variance for unbiased estimators), then the estimator is called efficient over the region. An estimator that by virtue of its bias has MSE which is less than or equal to $B(\theta,0)$ for all $\theta$ in a region $D \subset \Theta$, and strictly less than $B(\theta,0)$ for at least one $\theta \in D$, is called superefficient over $D$. Assume that $D$ is a finite rectangular region $D = D_1 \times \cdots \times D_n$, where $D_k$ is an open interval, $k = 1,\ldots,n$, and also assume that $F(\theta)$ satisfies the uniform continuity properties over $U = D$ assumed in the Corollary to Theorem 4. While achievement of the bound $B(\theta,\delta)$ is not necessary for superefficiency, if for a sufficiently small value of $\delta$ (to be specified below) the variance of an essentially unbiased estimator $\hat\theta_1$ achieves $B(\theta,\delta)$ for all $\theta$ in an open region $D$, then an open region $V \subset D$ exists such that $\hat\theta_1$ is a superefficient estimator over $\theta \in V$. To be specific, because the bias gradient of $\hat\theta_1$ satisfies the condition (10), for all $\theta, \theta^\circ \in D$


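$$
\begin{aligned}
|b_1(\theta) - b_1(\theta^\circ)|
&= \big|\, b_1(\theta) - b_1(\theta_1^\circ,\theta_2,\ldots,\theta_n) + b_1(\theta_1^\circ,\theta_2,\ldots,\theta_n) - b_1(\theta_1^\circ,\theta_2^\circ,\theta_3,\ldots,\theta_n) \\
&\qquad + \cdots + b_1(\theta_1^\circ,\ldots,\theta_{n-1}^\circ,\theta_n) - b_1(\theta^\circ) \,\big| \\
&\le |b_1(\theta) - b_1(\theta_1^\circ,\theta_2,\ldots,\theta_n)| + |b_1(\theta_1^\circ,\theta_2,\ldots,\theta_n) - b_1(\theta_1^\circ,\theta_2^\circ,\theta_3,\ldots,\theta_n)| \\
&\qquad + \cdots + |b_1(\theta_1^\circ,\ldots,\theta_{n-1}^\circ,\theta_n) - b_1(\theta^\circ)| \\
&= \left|\int_{\theta_1^\circ}^{\theta_1} \frac{\partial b_1(u_1,\theta_2,\ldots,\theta_n)}{\partial u_1}\,du_1\right| + \left|\int_{\theta_2^\circ}^{\theta_2} \frac{\partial b_1(\theta_1^\circ,u_2,\theta_3,\ldots,\theta_n)}{\partial u_2}\,du_2\right| \\
&\qquad + \cdots + \left|\int_{\theta_n^\circ}^{\theta_n} \frac{\partial b_1(\theta_1^\circ,\ldots,\theta_{n-1}^\circ,u_n)}{\partial u_n}\,du_n\right| \\
&\le \delta \sum_{i=1}^{n} |\theta_i - \theta_i^\circ| \\
&\le \delta M
\end{aligned}
\tag{130}
$$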
where $M$ is a positive constant independent of specific values $\theta, \theta^\circ \in D$:

$$M \stackrel{\rm def}{=} \sum_{i=1}^{n} \max_{\theta_i,\theta_i^\circ \in D_i} |\theta_i - \theta_i^\circ| \tag{131}$$

As $D$ is finite, $\max|\theta_i - \theta_i^\circ| < \infty$ and $M$ is finite. Assume without loss of generality that a point $\theta^\circ \in D$ exists such that $b_1(\theta^\circ) = 0$ (if no such point exists, pick an arbitrary $\theta^\circ \in D$ and redefine $\hat\theta_1$ to be $\hat\theta_1 - b_1(\theta^\circ)$). Assume $\mathrm{var}_\theta(\hat\theta_1) = B(\theta,\delta)$ for all $\theta \in D$. While it follows from Theorem 2 that for any $\delta > 0$, $B(\theta,\delta) < B(\theta,0)$, $\forall\theta$, we need to show that $\mathrm{MSE}_\theta(\hat\theta_1) \le B(\theta,0)$, where strict inequality holds for at least one point $\theta$. The form (120) of $B(\theta,\delta)$ in Theorem 4 gives for all $\theta \in D$

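$$
\begin{aligned}
\mathrm{MSE}_\theta(\hat\theta_1) &= B(\theta,\delta) + b_1^2(\theta) \\
&= B(\theta,0)\left(1 - 2\delta\sqrt{1 + c^T F_s^{-2} c}\right) + o(\delta) + b_1^2(\theta)
\end{aligned}
\tag{132}
$$

Therefore, subtracting $B(\theta,0)$ from both sides of (132) and using the bound (130) together with $b_1(\theta^\circ) = 0$,

$$
\begin{aligned}
\mathrm{MSE}_\theta(\hat\theta_1) - B(\theta,0) &= -\delta\left[2B(\theta,0)\sqrt{1 + c^T F_s^{-2} c} - \tfrac{1}{\delta}o(\delta)\right] + b_1^2(\theta) \\
&\le -\delta\left[2B(\theta,0)\sqrt{1 + c^T F_s^{-2} c} - \tfrac{1}{\delta}o(\delta)\right] + \delta^2 M^2 \\
&= -\delta\left[2B(\theta,0)\sqrt{1 + c^T F_s^{-2} c} - \tfrac{1}{\delta}o(\delta) - M^2\delta\right]
\end{aligned}
\tag{133}
$$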

Now, assuming uniform continuity and invertibility of $F(\theta)$ over $\theta \in D$, by Corollary 2 a region $V \subset D$ exists such that the quantity $\frac{1}{\delta}o(\delta)$ converges to zero uniformly over $\theta \in V$. Hence, since $2B(\theta,0)\sqrt{1 + c^T F_s^{-2} c} > 0$, a sufficiently small positive $\delta$ exists which is independent of $\theta$, such that

$$\mathrm{MSE}_\theta(\hat\theta_1) < B(\theta,0), \quad \forall\theta \in V \tag{134}$$

Therefore, $\hat\theta_1$ is a superefficient estimator over $V$.

The condition that the variance of $\hat\theta_1$ achieves the variance bound $B(\theta,\delta)$ everywhere in $D$ is unnecessarily restrictive. There are two necessary conditions for the achievability of $B(\theta,\delta)$. First, a real bias function $b_1(\theta)$ must exist that has the minimizing vector $d_{\min}(\theta)$ as its gradient over $D$. If $d_{\min}$ is the gradient of $b_1$, the gradient $\nabla_\theta d_{\min}$ is equal to the Hessian matrix of $b_1(\theta)$, which is always symmetric; therefore, the first necessary condition requires that $\nabla_\theta d_{\min}$ be a symmetric matrix. Second, since $B(\theta,\delta)$ is a CR-type bound, the sufficient condition for equality in the Schwarz inequality underlying the derivation of the CR bound must be satisfied. This latter condition can only be satisfied in the case of parameter estimation for exponential families of distributions ([11], Theorem 1, [6], Chapter 1, Section 7). As these two necessary conditions cannot always be satisfied, the variance bound $B(\theta,\delta)$ is generally not achievable over any region $D$.

If for some point $\theta^\circ$ an efficient estimator exists for the vector $\theta$ over an open neighborhood of $\theta^\circ$, then an essentially unbiased locally superefficient estimator can be constructed. $\hat\theta_1$ is called an essentially unbiased locally superefficient estimator at the point $\theta^\circ$ if $\hat\theta_1$ is essentially unbiased and if $\mathrm{MSE}_{\theta^\circ}(\hat\theta_1) < B(\theta^\circ,0)$ while $\mathrm{MSE}_\theta(\hat\theta_1) \le B(\theta,0)$ for $\theta$ over an open neighborhood of $\theta^\circ$. Local superefficiency is a weaker property than global superefficiency because a locally superefficient estimator only requires superefficient performance in the neighborhood of a particular point $\theta^\circ$ and it may have MSE which exceeds $B(\theta,0)$ outside this neighborhood.

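Let $\theta^\circ$ be some fixed point in $\Theta$ and let a $\delta > 0$ be specified. As in Corollary 2, assume that $F(\theta^\circ)$ is invertible and that $F(\theta)$ is uniformly continuous over an open neighborhood $U$ of $\theta^\circ$. This assumption implies that an open neighborhood $V \subset U$ of $\theta^\circ$ exists such that $F^{-1}(\theta)$ exists and is uniformly continuous over $V$. Now assume that $\hat\theta^{\rm eff}$ is an efficient estimator for $\theta$ over the neighborhood $V$, i.e., $\hat\theta^{\rm eff}$ is unbiased over $V$ and

$$E_\theta\left[(\hat\theta^{\rm eff} - \theta)(\hat\theta^{\rm eff} - \theta)^T\right] = F^{-1}(\theta), \quad \theta \in V. \tag{135}$$

The claim is that the following estimator is essentially unbiased, locally superefficient in an open neighborhood of $\theta^\circ$:

$$\hat\theta_1 = \theta_1^\circ + [e_1 + d_{\min}(\theta^\circ)]^T(\hat\theta^{\rm eff} - \theta^\circ) \tag{136}$$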

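where $d_{\min}(\theta)$ is the vector given in Theorem 1. Because $E_\theta[\hat\theta^{\rm eff}] = \theta$, we have for the bias of $\hat\theta_1$

$$
\begin{aligned}
b_1(\theta) &= E_\theta[\hat\theta_1 - \theta_1] \\
&= E_\theta\left[\theta_1^\circ + [e_1 + d_{\min}(\theta^\circ)]^T(\hat\theta^{\rm eff} - \theta^\circ) - \theta_1\right] \\
&= [e_1 + d_{\min}(\theta^\circ)]^T(\theta - \theta^\circ) - (\theta_1 - \theta_1^\circ) \\
&= [e_1 + d_{\min}(\theta^\circ)]^T(\theta - \theta^\circ) - e_1^T(\theta - \theta^\circ) \\
&= d_{\min}^T(\theta^\circ)(\theta - \theta^\circ), \quad \forall\theta \in V
\end{aligned}
\tag{137}
$$

Hence, the bias gradient $\nabla b_1(\theta^\circ)$ is simply $d_{\min}(\theta^\circ)$, which by Theorem 1 satisfies the constraint $d_{\min}^T d_{\min} \le \delta^2$, and the estimator $\hat\theta_1$ (136) is essentially unbiased over $\theta \in V$. Furthermore, the MSE of $\hat\theta_1$ is equal to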

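$$
\begin{aligned}
\mathrm{MSE}_\theta(\hat\theta_1) &= E_\theta[(\hat\theta_1 - \theta_1)^2] \\
&= E_\theta\left[\left(\theta_1^\circ + [e_1 + d_{\min}(\theta^\circ)]^T(\hat\theta^{\rm eff} - \theta^\circ) - \theta_1\right)^2\right] \\
&= E_\theta\left[\left([e_1 + d_{\min}(\theta^\circ)]^T(\hat\theta^{\rm eff} - \theta) + [e_1 + d_{\min}(\theta^\circ)]^T[\theta - \theta^\circ] - e_1^T[\theta - \theta^\circ]\right)^2\right] \\
&= E_\theta\left[\left([e_1 + d_{\min}(\theta^\circ)]^T(\hat\theta^{\rm eff} - \theta) + d_{\min}^T(\theta^\circ)[\theta - \theta^\circ]\right)^2\right] \\
&= [e_1 + d_{\min}(\theta^\circ)]^T F^{-1}(\theta)[e_1 + d_{\min}(\theta^\circ)] + \left(d_{\min}^T(\theta^\circ)[\theta - \theta^\circ]\right)^2 \\
&= [e_1 + d_{\min}(\theta^\circ)]^T F^{-1}(\theta)[e_1 + d_{\min}(\theta^\circ)] + b_1^2(\theta), \quad \forall\theta \in V
\end{aligned}
\tag{138}
$$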


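Define the function $e(\theta)$

$$
\begin{aligned}
e(\theta) &\stackrel{\rm def}{=} \mathrm{MSE}_\theta(\hat\theta_1) - B(\theta,0) \\
&= [e_1 + d_{\min}(\theta^\circ)]^T F^{-1}(\theta)[e_1 + d_{\min}(\theta^\circ)] - e_1^T F^{-1}(\theta)e_1 + b_1^2(\theta).
\end{aligned}
\tag{139}
$$

Note that when $\theta = \theta^\circ$, $\mathrm{MSE}_{\theta^\circ}(\hat\theta_1) = B(\theta^\circ,\delta)$ so that $e(\theta^\circ) < 0$ for all positive $\delta$. Furthermore, $e(\theta)$ is uniformly continuous over $\theta \in V$ because $F^{-1}(\theta)$ is uniformly continuous over $\theta \in V$ and $b_1(\theta)$ (137) is linear. Hence, for any $\gamma > 0$ and any $\theta \in V$ an $\epsilon = \epsilon(\gamma)$ independent of $\theta$ exists such that

$$|e(\theta) - e(\theta^\circ)| < \gamma \tag{140}$$

whenever $\|\theta - \theta^\circ\| < \epsilon(\gamma)$. Relation (140) implies

$$e(\theta) < e(\theta^\circ) + \gamma$$

As $e(\theta^\circ) < 0$, a $\gamma = \gamma_0$ exists sufficiently small so that $e(\theta) < 0$ for all $\theta$ in the neighborhood $O = \{\theta : \|\theta - \theta^\circ\| < \epsilon(\gamma_0)\}$. Consequently, $\mathrm{MSE}_\theta(\hat\theta_1) < B(\theta,0)$ for all $\theta \in O$ and $\hat\theta_1$ is locally superefficient.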
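
As a worked illustration of this construction, consider the scalar case $n = 1$ with a constant Fisher information $F(\theta) = a > 0$ and $0 < \delta < 1$. This special case is an assumption introduced here for illustration only; it is not one of the report's examples. With a single parameter, direct minimization of $(1+d)^2/a$ over $|d| \le \delta$ gives $d_{\min} = -\delta$, and the quantities above reduce to

$$
\begin{aligned}
B(\theta,\delta) &= \frac{(1-\delta)^2}{a}, \qquad \hat\theta_1 = \theta^\circ + (1-\delta)(\hat\theta^{\rm eff} - \theta^\circ), \\
b_1(\theta) &= -\delta(\theta - \theta^\circ), \\
\mathrm{MSE}_\theta(\hat\theta_1) &= \frac{(1-\delta)^2}{a} + \delta^2(\theta - \theta^\circ)^2, \\
e(\theta) &= \mathrm{MSE}_\theta(\hat\theta_1) - \frac{1}{a} = -\frac{\delta(2-\delta)}{a} + \delta^2(\theta - \theta^\circ)^2.
\end{aligned}
$$

Under these assumptions $e(\theta) < 0$ exactly when $(\theta - \theta^\circ)^2 < (2-\delta)/(\delta a)$, so the neighborhood of local superefficiency can be written explicitly, and it widens as $\delta \to 0$.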

7.2  Properties  of  the  New  Bound 

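The behavior of the new bound will be treated by listing some of its properties. For convenience we repeat equation (107) for the slope of the difference $\Delta B(\theta,\delta)$:

$$\left.\frac{d\,\Delta B(\theta,\delta)}{d\delta}\right|_{\delta=0} = 2\sqrt{1 + c^T F_s^{-2} c}\,, \tag{141}$$

where

$$\Delta B(\theta,\delta) \stackrel{\rm def}{=} \frac{B(\theta,0) - B(\theta,\delta)}{B(\theta,0)}. \tag{142}$$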


Properties: 


1. The slope (141) is positive (because $F_s^{-2}$ is positive-definite), as expected because $B(\theta,\delta)$ is less than the unbiased CR bound $B(\theta,0)$. More significant is the fact that to $o(\delta)$ approximation the relation between $B(\theta,\delta)$ and $B(\theta,0)$ is multiplicative as a function of $\delta$: $B(\theta,\delta) \approx B(\theta,0)\left(1 - 2\delta\sqrt{1 + c^T F_s^{-2} c}\right)$ (recall Theorem 4). This provides indirect evidence for a potential for severe bias sensitivity of the CR bound.

2. The slope (141) of $\Delta B(\theta,\delta)$ is characterized by the length of the vector $F_s^{-1}c$, a quantity which has been identified in Theorem 3 as the sensitivity index $\eta$,

$$\eta \stackrel{\rm def}{=} \|F_s^{-1}c\| = \sqrt{c^T F_s^{-2} c}\,. \tag{143}$$

$\eta$ is a dimensionless quantity which measures the inherent sensitivity of the unbiased CR bound to bias in the sense that $-2\sqrt{1 + \eta^2}$ is the per $\delta$ decrease of $B(\theta,\delta)$ relative to the unbiased CR bound $B(\theta,0)$. A useful form for $\eta$ is given in terms of the elements of the inverse Fisher matrix $F^{-1}$ (70) and (72).


3. The quantity $\eta$ (143) can be interpreted in terms of the "joint estimability" of the set of parameters $\theta = (\theta_1,\ldots,\theta_n)^T$ by recalling the CR matrix inequality (23) for the covariance of an unbiased vector estimator $\hat\theta = (\hat\theta_1,\ldots,\hat\theta_n)^T$,

$$\mathrm{cov}_\theta(\hat\theta) \ge F^{-1}.$$

This gives a correspondence between $\eta$ and the covariance between a set of optimal unbiased estimators, i.e., an efficient estimator $\hat\theta_i^{\rm eff}$, $i = 1,\ldots,n$:

$$\eta^2 = \sum_{j=2}^{n} \rho_{1j}^2\,\frac{\sigma_j^2}{\sigma_1^2}, \tag{146}$$

where $\rho_{ij}$ is the correlation coefficient between $\hat\theta_i^{\rm eff}$ and $\hat\theta_j^{\rm eff}$, and $\sigma_j^2 = \mathrm{var}(\hat\theta_j^{\rm eff})$.

In light of (146), the set of important but difficult-to-estimate parameters is of particular significance; these are information-coupled with $\theta_1$ ($\rho_{1j} \ne 0$) but their best unbiased estimators have large variance ($\mathrm{var}(\hat\theta_j^{\rm eff}) = \sigma_j^2 \gg 0$). $\eta$ (143) is a measure of the propensity of these parameters to influence the estimator of $\theta_1$.

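4. Note that the slope (141) of $\Delta B(\theta,\delta)$ is minimized for the case $c^T F_s^{-2} c = 0$, as

$$\left.\frac{d\,\Delta B(\theta,\delta)}{d\delta}\right|_{\delta=0} = 2\sqrt{1 + c^T F_s^{-2} c} \ge 2 \tag{148}$$

Therefore, $B(\theta,\delta)$ is the most stable with respect to bias when there is no information coupling between $\theta_1$ and the other parameters so that $c = 0$ and $\eta = 0$. Furthermore, if $\eta^2 = c^T F_s^{-2} c = 0$, $F_s^{-1}c$ must be the zero vector and, from (85), the minimizing bias gradient has the form

$$d_{\min} = [-\delta, 0, \ldots, 0]^T. \tag{149}$$

From the form of the gradient (149) the "best" biased estimator has mean

$$E_\theta[\hat\theta_1] = (1 - \delta)\theta_1 + \mathrm{const}. \tag{150}$$

This mean can be obtained by a simple "shrinkage" of an efficient estimator $\hat\theta_1^{\rm eff}$, if such exists:

$$\hat\theta_1 = (1 - \delta)\hat\theta_1^{\rm eff} + \mathrm{const}, \tag{151}$$

and, as discussed in Section 7.1, $\hat\theta_1$ is locally superefficient in a neighborhood of $\theta^\circ$.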
5. Let the Fisher information be such that the $l$th element of $c^T F_s^{-1}$ dominates the other elements: $c^T F_s^{-1} \approx [0,\ldots,0,\zeta,0,\ldots,0]$, where as in (146) $\zeta \stackrel{\rm def}{=} \rho_{1l}\sigma_l/\sigma_1$. This can occur if $\theta_l$ is a coupled but extremely hard-to-estimate parameter. Then, the sensitivity index is given by

$$\eta^2 = c^T F_s^{-2} c \approx \zeta^2, \tag{152}$$

and, under the additional assumption $\zeta \gg 1$, the minimizing bias gradient (122) takes the approximate form

$$d_{\min} \approx [0,\ldots,0,\delta,0,\ldots,0]^T, \tag{153}$$

i.e.,

$$b_1(\theta) = \delta\theta_l + \mathrm{const}. \tag{154}$$

In this case, a small bias due to information coupling between $\theta_1$ and $\theta_l$ can give a substantial decrease in the general CR bound.
7.3 Practical Implementation of the Bound

From (146), the sensitivity index $\eta$ is apparently dependent on the units used to represent the parameters $\theta_1,\ldots,\theta_n$. For example, if $\theta_1$ is represented in μrads and $\theta_2$ is represented in MW, the quantity $\sigma_2^2/\sigma_1^2$ in (146) will be much smaller than if $\theta_1$ is expressed in rads and $\theta_2$ is expressed in μW. Thus, a parameter may be rendered difficult to estimate solely by virtue of the choice of units used to represent the parameter. The choice of units is equivalent to the specification of a coordinate system to represent the parameters (see Section 3). This unit dependency is due to the fact that when taken alone the constraint on the bias gradient is not tied to a particular choice of coordinates for $\theta$ and therefore does not adequately describe a user constraint on the bias. In order to use the results of this report, the coordinates for $\theta$ must be specified, e.g., through the specification of a constraint ellipsoid (11) in $\Theta$ over which the bias $b_1(\theta)$ is allowed to vary by at most $\gamma$. This bias constraint (11) will naturally reflect the user's choice of units for the parameters. In Section 3 we transformed the user's units to make the constraint ellipsoid a sphere (12). The results derived in Sections 5 and 6 are valid in these transformed coordinates. The reason that we chose to work with spherical constraint regions is that for a nonspherical region we could not have simultaneously achieved properties 1. and 2. of Proposition 1 for our form of bias-gradient constraint (10). Thus, expression of the bias gradient in a transformation-induced spherical-constraint region guarantees maximal bound reduction, i.e., the greatest freedom on the value of $\nabla b_1(\theta)$ subject to the requirement (11), as compared with any other choice of coordinates.
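
The unit dependence described above can be seen in a small numerical sketch. The 2 x 2 Fisher matrix below is a hypothetical example (its entries are not taken from this report), and the function names are illustrative. The sketch assumes the standard transformation rule for a linear change of coordinates $\theta' = S\theta$, under which the Fisher matrix becomes $S^{-T} F S^{-1}$; here only the second parameter is rescaled.

```python
import math

def sensitivity_index_2x2(F):
    """eta = ||F_s^{-1} c|| for a 2x2 Fisher matrix [[a, c], [c, F_s]] (F_s scalar)."""
    c, F_s = F[0][1], F[1][1]
    return abs(c / F_s)

def unbiased_cr_bound_2x2(F):
    """B(theta, 0) = 1 / (a - c F_s^{-1} c) for the first parameter."""
    a, c, F_s = F[0][0], F[0][1], F[1][1]
    return 1.0 / (a - c * c / F_s)

def rescale_second_parameter(F, s):
    """Fisher matrix after the change of units theta_2' = s * theta_2.

    With S = diag(1, s) the Fisher matrix transforms as S^{-T} F S^{-1}.
    """
    return [[F[0][0], F[0][1] / s],
            [F[1][0] / s, F[1][1] / s**2]]

# Hypothetical Fisher matrix (illustrative numbers only).
F = [[4.0, 1.0],
     [1.0, 2.0]]

s = 1e6  # e.g. expressing theta_2 in a unit one million times smaller
F_scaled = rescale_second_parameter(F, s)

eta = sensitivity_index_2x2(F)
eta_scaled = sensitivity_index_2x2(F_scaled)

print("eta in original units:", eta)
print("eta after rescaling  :", eta_scaled)
print("B(theta,0) unchanged :", unbiased_cr_bound_2x2(F), unbiased_cr_bound_2x2(F_scaled))
```

In this sketch the unbiased bound on $\theta_1$ is unaffected by the change of units for $\theta_2$, while the sensitivity index scales by the factor $s$. That is why the coordinate system has to be fixed before the bias-gradient constraint is meaningful.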
8. APPLICATIONS


In this section we apply the previous results to estimation of standard deviation and estimation of the correlation coefficient, based on measurements of a correlated pair of IID zero-mean Gaussian random sequences. Analytical expressions are given for the unbiased CR bound $B(\theta,0)$, the sensitivity index $\eta$, and the approximation $d^*_{\min}$ of Theorem 4 for both of these estimation problems. Then, for the estimation of standard deviation, the exact bound $B(\theta,\delta)$ (56) of Theorem 1 is numerically evaluated and its behavior is compared to the behavior of the sensitivity index.

Consider the following covariance estimation problem. Available for observation are a pair of random sequences

$$\{X_{1i}\}_{i=1}^{m}, \quad \{X_{2i}\}_{i=1}^{m}, \tag{155}$$

where $\{[X_{1i}, X_{2i}]^T\}_{i=1}^{m}$ are IID Gaussian random vectors with mean zero and unknown covariance matrix

$$\Lambda = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}. \tag{156}$$

Define the correlation coefficient

$$\rho = \frac{\sigma_{12}}{\sigma_1\sigma_2}, \tag{157}$$

where $\sigma_1$ and $\sigma_2$ are the positive square roots of $\sigma_1^2$ and $\sigma_2^2$, respectively.


8.1  Estimation  of  Standard  Deviation 


The objective here is to estimate the standard deviation $\sigma_1$ of $X_{1i}$, under the assumption that $\sigma_2$ and $\rho$ are unknown. The log-likelihood function corresponding to this estimation problem has the form (A.2) from Appendix A

$$l(\sigma_1,\sigma_2,\rho) = -\frac{m}{2}\ln\left((2\pi)^2\sigma_1^2\sigma_2^2[1 - \rho^2]\right) - \frac{m}{2(1 - \rho^2)}\left(\frac{\overline{X_1^2}}{\sigma_1^2} - 2\rho\frac{\overline{X_1X_2}}{\sigma_1\sigma_2} + \frac{\overline{X_2^2}}{\sigma_2^2}\right) \tag{158}$$


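where in (158) $\overline{X_j^2}$ is the sample variance of $\{X_{ji}\}$, $j = 1, 2$, and $\overline{X_1X_2}$ is the sample covariance between $\{X_{1i}\}$ and $\{X_{2i}\}$ for known mean zero. The Fisher information matrix for $\sigma_1$, $\sigma_2$, and $\rho$ is derived in Appendix A, Equation (A.11)

$$F = \frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{2-\rho^2}{\sigma_1^2} & -\dfrac{\rho^2}{\sigma_1\sigma_2} & -\dfrac{\rho}{\sigma_1} \\[2mm] -\dfrac{\rho^2}{\sigma_1\sigma_2} & \dfrac{2-\rho^2}{\sigma_2^2} & -\dfrac{\rho}{\sigma_2} \\[2mm] -\dfrac{\rho}{\sigma_1} & -\dfrac{\rho}{\sigma_2} & \dfrac{1+\rho^2}{1-\rho^2} \end{bmatrix} \tag{159}$$

We identify the parameter vector $\theta$: $\theta_1 \stackrel{\rm def}{=} \sigma_1$, $\theta_2 \stackrel{\rm def}{=} \sigma_2$, and $\theta_3 \stackrel{\rm def}{=} \rho$. The unbiased CR bound $B(\theta,0)$ (30), the sensitivity index $\eta$ (143), and the $o(\delta)$ approximation $d^*_{\min}$ (122) are derived in Appendix A, Equations (A.24), (A.25), and (A.26), respectively.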


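$$B(\theta,0) = \frac{\sigma_1^2}{2m} \tag{160}$$

$$\eta^2 = \frac{\sigma_2^2}{\sigma_1^2}\rho^4 + \frac{1}{\sigma_1^2}\rho^2(1-\rho^2)^2 \tag{161}$$

$$d^*_{\min} = \frac{\delta}{\sqrt{1+\eta^2}}\left[-1,\; -\frac{\sigma_2}{\sigma_1}\rho^2,\; -\frac{\rho}{\sigma_1}(1-\rho^2)\right]^T \tag{162}$$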


The following comments are of interest:

1. The CR bound $B(\theta,0)$ (160) on the variance of an unbiased estimator $\hat\sigma_1$ is functionally independent of the variance of $X_{2i}$ and of the correlation $\rho$ between $X_{1i}$ and $X_{2i}$.

2. If $\rho = 0$, then $\eta = 0$, and the general CR bound is minimally sensitive to bias-gradient length $\delta$.

3. The squared sensitivity (161) is the convex combination, with weights $\rho^2$ and $1-\rho^2$, of two terms: $\frac{\sigma_2^2}{\sigma_1^2}\rho^2$ and $\frac{1}{\sigma_1^2}\rho^2(1-\rho^2)$. Since $\eta^2$ is inversely proportional to $\sigma_1^2$, the sensitivity of the CR bound to small bias can be significant when $\sigma_1^2$ is small.

4. For $\rho^2$ close to 0, the ratio of the first and second terms of (161) is approximately $\sigma_2^2\rho^2$. In this case, if $\sigma_2^2 \gg 1/\rho^2$ the first term dominates $\eta^2$, while if $\sigma_2^2 \ll 1/\rho^2$ the second term dominates. For $\rho^2$ close to 1, the ratio of the first to second term is approximately $\sigma_2^2/(1-\rho^2)^2$. In this case, if $\sigma_2^2 \gg (1-\rho^2)^2$ the first term dominates, while if $\sigma_2^2 \ll (1-\rho^2)^2$ the second term dominates. In any case, it is seen that $\sigma_2^2 \gg 1$ brings about a much higher sensitivity index than $\sigma_2^2 \ll 1$, particularly for $|\rho| \gg 0$.

A related problem is the estimation of the correlation coefficient $\rho$ for the above pair of Gaussian measurements.
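
The closed forms for the unbiased bound and the squared sensitivity index can be cross-checked numerically against the partition $a$, $c$, $F_s$ of the Fisher matrix. The following is a minimal sketch, assuming the Fisher matrix (159) with parameter ordering $(\sigma_1, \sigma_2, \rho)$ and the expressions $B(\theta,0) = \sigma_1^2/(2m)$ and $\eta^2 = (\sigma_2^2\rho^4 + \rho^2(1-\rho^2)^2)/\sigma_1^2$ as read from (160) and (161). The function names and test values are illustrative.

```python
import math

def fisher_matrix(sigma1, sigma2, rho, m=1):
    """Fisher matrix (159) for theta = (sigma1, sigma2, rho), m IID samples."""
    k = m / (1.0 - rho**2)
    return [
        [k * (2 - rho**2) / sigma1**2, -k * rho**2 / (sigma1 * sigma2), -k * rho / sigma1],
        [-k * rho**2 / (sigma1 * sigma2), k * (2 - rho**2) / sigma2**2, -k * rho / sigma2],
        [-k * rho / sigma1, -k * rho / sigma2, k * (1 + rho**2) / (1 - rho**2)],
    ]

def bound_and_index(F):
    """Return (B(theta,0), eta^2) from the partition a, c, F_s of a 3x3 matrix F."""
    a = F[0][0]
    c = [F[1][0], F[2][0]]
    (p, q), (r, s) = F[1][1:], F[2][1:]
    det = p * s - q * r
    # v = F_s^{-1} c, using the closed-form inverse of the 2x2 block F_s
    v = [(s * c[0] - q * c[1]) / det, (-r * c[0] + p * c[1]) / det]
    b0 = 1.0 / (a - (c[0] * v[0] + c[1] * v[1]))   # 1 / (a - c^T F_s^{-1} c)
    eta_sq = v[0]**2 + v[1]**2                      # ||F_s^{-1} c||^2
    return b0, eta_sq

# Illustrative parameter values (not from the report).
sigma1, sigma2, rho, m = 1.0, 5.0, 0.6, 10

b0, eta_sq = bound_and_index(fisher_matrix(sigma1, sigma2, rho, m))
b0_closed = sigma1**2 / (2 * m)
eta_sq_closed = (sigma2**2 * rho**4 + rho**2 * (1 - rho**2)**2) / sigma1**2

print("B(theta,0): numeric", b0, "closed form", b0_closed)
print("eta^2     : numeric", eta_sq, "closed form", eta_sq_closed)
```

Agreement between the two columns for several parameter values is a quick sanity check on the transcription of (159) through (161).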


8.2  Estimation  of  Correlation  Coefficient 

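Consider the estimation of $\theta_1 \stackrel{\rm def}{=} \rho$, when $\theta_2 \stackrel{\rm def}{=} \sigma_1$ and $\theta_3 \stackrel{\rm def}{=} \sigma_2$ are unknown. For this case we have, from Appendix A, Equations (A.39), (A.40), and (A.41), respectively:

$$B(\theta,0) = \frac{1}{m}(1-\rho^2)^2 \tag{163}$$

$$\eta^2 = \frac{\sigma_1^2 + \sigma_2^2}{2}\cdot\frac{\rho^2}{2(1-\rho^2)^2} \tag{164}$$

$$d^*_{\min} = \frac{\delta}{\sqrt{1+\eta^2}}\left[-1,\; -\frac{\rho}{2(1-\rho^2)}\sigma_1,\; -\frac{\rho}{2(1-\rho^2)}\sigma_2\right]^T \tag{165}$$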


We make the following comments concerning the problem of estimation of $\rho$:

1. The CR bound $B(\theta,0)$ (163) on the variance of an unbiased estimator $\hat\rho$ has a form which is independent of the standard deviations $\sigma_1$ and $\sigma_2$. Hence, the MSE performance of an unbiased efficient estimator of $\rho$ is also invariant to these quantities.

2. As occurs in estimation of $\sigma_1$, the case $\rho = 0$ corresponds to a CR bound which is minimally sensitive to small biases.

3. The form of the sensitivity index $\eta$ (164) indicates that the CR bound is sensitive to small biases if the average $(\sigma_1^2 + \sigma_2^2)/2$ is large and if there is significant correlation $|\rho| \gg 0$. This, along with the form of $d^*_{\min}$ (165), suggests that substantial improvement in estimator variance may be possible by using an estimator whose bias is not invariant to the standard deviations $\sigma_1$ and $\sigma_2$.

8.3  Numerical  Evaluations 

Surfaces for the normalized difference $\Delta B(\theta,\delta)$ (55) and the sensitivity index $\eta$ were generated numerically for the standard deviation estimation problem of Section 8.1. The matrix $F$ (159) was input to a computer program which computes $\Delta B(\theta,\delta)$ and $\eta$ as the parameter vector $\theta = (\sigma_1,\sigma_2,\rho)$ ranges over the set $\{1\} \times [0, 1000) \times [-1, 1)$. For this example, $\delta$ was set to 0.001 and $m = 1$. In Figure 5 the quantity $\Delta B(\theta,\delta)$ is plotted over $(\sigma_2,\rho)$ and in Figure 6 the sensitivity index is plotted over the same range of $(\sigma_2,\rho)$. A comparison between Figures 5 and 6 supports the approximate small $\delta$ analysis in Section 8.1, which was based on an investigation of the sensitivity index. Note that the region of $\rho$ for which the bound $B(\theta,\delta)$ differs significantly from the unbiased bound becomes increasingly large as the standard deviation $\sigma_2$ of $X_{2i}$ increases. Further, when $\rho = 0$ then $\eta = 0$ in (161), and we see minimal bias sensitivity. In the limit as $\sigma_2 \to \infty$ the surface $\Delta B(\theta,\delta)$ (Figure 5) becomes a deep wedge centered along the line $\rho = 0$; substantial bound reduction is achieved for all nonzero $\rho$. Figure 5 indicates that for small $\sigma_2$ the surface $\Delta B(\theta,\delta)$ is close to zero for all $\rho$. A simple calculation shows that $\rho^2(1-\rho^2)^2$ attains its maximum for $\rho^2 = 1/3$. Hence, in view of the expression (161), $\eta^2 \le \sigma_2^2/\sigma_1^2 + 4/(27\sigma_1^2)$. Thus, for the present example where $\sigma_1 = 1.0$, $\eta^2 \le \sigma_2^2 + 4/27$. Recalling that $\Delta B(\theta,\delta) \approx \delta\,\frac{d\Delta B}{d\delta}\big|_{\delta=0} = 2\delta\sqrt{1+\eta^2}$ (107), this implies that little bound reduction occurs for small $\sigma_2$ and $\delta = 0.001$.

Figure 5. The normalized difference $\Delta B(\theta,\delta)$ plotted as a function of $\sigma_2$ and $\rho$ for RMS power estimation. Increasing values of $\Delta B(\theta,\delta)$ correspond to increased sensitivity of the CR bound to small bias. Surface is plotted for $\delta = 0.001$ and $\sigma_1 = 1.0$.

Figure 6. The bias-sensitivity index $\eta$ plotted over the same range of the parameters as in Figure 5.
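
The kind of numerical evaluation described in this section can be sketched as follows. This is not the program used to generate Figures 5 and 6; it is a minimal illustration under stated assumptions. It assumes that the bound minimizes $[e_1 + d]^T F^{-1}[e_1 + d]$ over $\|d\| \le \delta$, that the Lagrangian solution is $d_{\min} = -[I + \lambda F]^{-1}e_1$ with $\lambda > 0$ the root of $e_1^T[I + \lambda F]^{-2}e_1 = \delta^2$, and that $F$ is the matrix (159). The helper names `solve`, `exact_bound`, and `normalized_difference` are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def fisher_matrix(sigma1, sigma2, rho, m=1):
    """Fisher matrix (159) for theta = (sigma1, sigma2, rho)."""
    k = m / (1.0 - rho**2)
    return [
        [k * (2 - rho**2) / sigma1**2, -k * rho**2 / (sigma1 * sigma2), -k * rho / sigma1],
        [-k * rho**2 / (sigma1 * sigma2), k * (2 - rho**2) / sigma2**2, -k * rho / sigma2],
        [-k * rho / sigma1, -k * rho / sigma2, k * (1 + rho**2) / (1 - rho**2)],
    ]

def exact_bound(F, delta):
    """B(theta, delta) for 0 < delta < 1, by bisection on the multiplier lambda."""
    n = len(F)
    e1 = [1.0] + [0.0] * (n - 1)

    def u_of(lam):
        # u = (I + lam F)^{-1} e1, so that d_min = -u
        A = [[(1.0 if i == j else 0.0) + lam * F[i][j] for j in range(n)] for i in range(n)]
        return solve(A, e1)

    def g(lam):
        return sum(x * x for x in u_of(lam))  # squared length of d_min

    lo, hi = 0.0, 1.0
    while g(hi) > delta**2:      # g decreases from 1 toward 0 as lam grows
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > delta**2:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    u = u_of(lam)
    # (e1 + d)^T F^{-1} (e1 + d) simplifies to lam * (u_1 - u^T u) when d = -u
    return lam * (u[0] - sum(x * x for x in u))

def normalized_difference(F, delta):
    """Return (exact Delta B, first-order value 2 delta sqrt(1 + eta^2), eta)."""
    x = solve(F, [1.0] + [0.0] * (len(F) - 1))   # first column of F^{-1}
    b0 = x[0]                                     # unbiased bound B(theta, 0)
    # F^{-1} e1 = B0 [1, -F_s^{-1} c], hence eta^2 = ||x||^2 / B0^2 - 1
    eta = math.sqrt(max(0.0, sum(v * v for v in x) / b0**2 - 1.0))
    exact = (b0 - exact_bound(F, delta)) / b0
    return exact, 2.0 * delta * math.sqrt(1.0 + eta**2), eta

results = [normalized_difference(fisher_matrix(1.0, s2, r), 0.001)
           for s2, r in [(0.5, 0.3), (5.0, 0.6)]]
for exact, first_order, eta in results:
    print(f"eta={eta:.4f}  exact={exact:.6e}  first-order={first_order:.6e}")
```

For small $\delta$ the exact and first-order values should nearly agree, which is the comparison that the discussion of Figures 5 and 6 relies on. Sweeping the two sample points over a grid of $(\sigma_2, \rho)$ would reproduce surfaces of the same type.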
9. CONCLUSION


A new lower bound on estimator variance for almost unbiased estimators in the presence of nuisance parameters has been derived as a function of the Fisher information matrix and the unbiased CR bound. The new bound has the form (56) which involves finding the positive root $\lambda$ of a convex function (57). The bound (56) is valid over all estimators whose bias gradient has length less than or equal to a prespecified constant $\delta$. It reduces to the standard CR bound on unbiased estimators for $\delta = 0$. The sensitivity of the general CR bound to small bias has been characterized by the slope of the normalized difference between the new bound and the CR bound. This slope is monotone in a sensitivity index $\eta$ (108). The form of the sensitivity index indicates that the new bound is significantly less than the unbiased CR bound when important but difficult-to-estimate nuisance parameters exist. This implies that the application of the CR bound is unreliable for this situation due to severe bias sensitivity.

We obtained numerical and analytical results for the problem of estimation of the standard deviations and the correlation coefficient for a pair of IID zero-mean Gaussian sequences. For estimation of the standard deviation of the first sequence it was shown that a small estimator bias can significantly affect the CR bound when the variance of the second sequence dominates the variance of the first and the correlation coefficient of the two sequences has high magnitude. For correlation estimation, it was shown that a small estimator bias can significantly affect the CR bound when the average of the two variances is high and the correlation coefficient has large magnitude.
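APPENDIX


Here the Fisher matrix $F$ for the estimation problem of Section 8 is derived and the quantities $B(\theta,0)$, $\eta$ and $d^*_{\min}$ of Theorem 3 and Theorem 4 are calculated.

A.1 Fisher Information Matrix for Estimation of 2 x 2 Covariance Matrix

Since $X_i = [X_{i1}, X_{i2}]^T$ are IID bivariate Gaussian random vectors with mean 0 and covariance

$$\Lambda = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}, \tag{A.1}$$

the log-likelihood function for $\sigma_1$, $\sigma_2$, and $\rho = \sigma_{12}/(\sigma_1\sigma_2)$, given the observations $X_1,\ldots,X_m$, is

$$
\begin{aligned}
l(\sigma_1,\sigma_2,\rho) &= \ln f(X_1,\ldots,X_m;\sigma_1,\sigma_2,\rho) \\
&= -\frac{m}{2}\ln\left((2\pi)^2\sigma_1^2\sigma_2^2[1-\rho^2]\right) - \frac{m}{2(1-\rho^2)}\left(\frac{\overline{X_1^2}}{\sigma_1^2} - 2\rho\frac{\overline{X_1X_2}}{\sigma_1\sigma_2} + \frac{\overline{X_2^2}}{\sigma_2^2}\right)
\end{aligned}
\tag{A.2}
$$

where $\overline{X_1^2}$, $\overline{X_2^2}$ and $\overline{X_1X_2}$ are the sample variances and the sample covariance associated with $\{X_{i1}\}$ and $\{X_{i2}\}$, respectively.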

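Next, the elements of the Fisher matrix $F$ (22) are computed:

$$F = -E_\theta\begin{bmatrix} \dfrac{\partial^2 l}{\partial\sigma_1^2} & \dfrac{\partial^2 l}{\partial\sigma_1\partial\sigma_2} & \dfrac{\partial^2 l}{\partial\sigma_1\partial\rho} \\[2mm] \dfrac{\partial^2 l}{\partial\sigma_2\partial\sigma_1} & \dfrac{\partial^2 l}{\partial\sigma_2^2} & \dfrac{\partial^2 l}{\partial\sigma_2\partial\rho} \\[2mm] \dfrac{\partial^2 l}{\partial\rho\,\partial\sigma_1} & \dfrac{\partial^2 l}{\partial\rho\,\partial\sigma_2} & \dfrac{\partial^2 l}{\partial\rho^2} \end{bmatrix} \tag{A.3}$$

Using (A.2), the partial derivatives in $F$ are simply computed. The results are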


_  _  p  X1X2 

daxda2  ^\-p^ 


(A.4) 

(A.5) 
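where the remaining terms are omitted due to symmetry of $F$. As the sample averages in (A.4)-(A.9) are unbiased,

$$E_\theta\left[\frac{\overline{X_1^2}}{\sigma_1^2}\right] = 1, \qquad E_\theta\left[\frac{\overline{X_2^2}}{\sigma_2^2}\right] = 1, \qquad E_\theta\left[\frac{\overline{X_1X_2}}{\sigma_1\sigma_2}\right] = \rho; \tag{A.10}$$

the elements of $F$ are easily computed by taking the expectation over the quantities in (A.4)-(A.9). The results are

$$
\begin{aligned}
F_{11} &= \frac{m}{1-\rho^2}\,\frac{2-\rho^2}{\sigma_1^2} \\
F_{12} &= -E_\theta\left[\frac{\partial^2 l}{\partial\sigma_1\partial\sigma_2}\right] = \frac{m}{1-\rho^2}\,\frac{(-\rho^2)}{\sigma_1\sigma_2} \\
F_{13} &= -E_\theta\left[\frac{\partial^2 l}{\partial\sigma_1\partial\rho}\right] = \frac{m}{1-\rho^2}\,\frac{(-\rho)}{\sigma_1} \\
F_{21} &= F_{12} \\
F_{22} &= \frac{m}{1-\rho^2}\,\frac{2-\rho^2}{\sigma_2^2} \\
F_{23} &= \frac{m}{1-\rho^2}\,\frac{(-\rho)}{\sigma_2} \\
F_{31} &= F_{13} \\
F_{32} &= F_{23} \\
F_{33} &= \frac{m}{1-\rho^2}\,\frac{1+\rho^2}{1-\rho^2}
\end{aligned}
\tag{A.11}
$$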
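
The Fisher matrix elements can be checked without carrying out the differentiation by hand. Because the log-likelihood (A.2) is linear in the sample moments, its expectation under the true parameters is obtained by replacing each sample moment with its mean, and $F$ is then minus the Hessian of that expected log-likelihood evaluated at the true parameters. The sketch below approximates the Hessian with central finite differences and compares it with the closed forms in (A.11). The step size, tolerance, test point, and function names are illustrative choices.

```python
import math

def expected_loglik(theta, theta0, m=1):
    """Expectation of (A.2) at theta when the data are generated under theta0."""
    s1, s2, r = theta
    s10, s20, r0 = theta0
    # E[X1^2] = s10^2, E[X2^2] = s20^2, E[X1 X2] = r0 * s10 * s20
    quad = s10**2 / s1**2 - 2 * r * r0 * s10 * s20 / (s1 * s2) + s20**2 / s2**2
    return -m * (math.log(2 * math.pi) + math.log(s1) + math.log(s2)
                 + 0.5 * math.log(1 - r**2) + quad / (2 * (1 - r**2)))

def numerical_fisher(theta0, m=1, h=1e-4):
    """Minus the finite-difference Hessian of the expected log-likelihood at theta0."""
    n = len(theta0)
    F = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def f(di, dj):
                t = list(theta0)
                t[i] += di
                t[j] += dj
                return expected_loglik(t, theta0, m)
            second = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h * h)
            F[i][j] = -second
    return F

def closed_form_fisher(sigma1, sigma2, rho, m=1):
    """Elements of F as listed in (A.11)."""
    k = m / (1.0 - rho**2)
    return [
        [k * (2 - rho**2) / sigma1**2, -k * rho**2 / (sigma1 * sigma2), -k * rho / sigma1],
        [-k * rho**2 / (sigma1 * sigma2), k * (2 - rho**2) / sigma2**2, -k * rho / sigma2],
        [-k * rho / sigma1, -k * rho / sigma2, k * (1 + rho**2) / (1 - rho**2)],
    ]

theta0 = (1.0, 2.0, 0.5)  # illustrative test point (sigma1, sigma2, rho)
F_num = numerical_fisher(theta0)
F_cf = closed_form_fisher(*theta0)
max_err = max(abs(F_num[i][j] - F_cf[i][j]) for i in range(3) for j in range(3))
print("largest absolute discrepancy:", max_err)
```

A discrepancy at the level of the finite-difference error indicates that the entries of (A.11) are consistent with the log-likelihood (A.2).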


A.2 Bound Derivations for Estimation of 2 x 2 Covariance Matrix

Next, the sensitivity index $\eta$ and the approximation to the minimizing bias-gradient vector $d^*_{\min}$ are derived for estimation of the standard deviation, $\theta_1 = \sigma_1$, and for correlation estimation, $\theta_1 = \rho$, respectively.


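Estimation of Standard Deviation


In the case where $\sigma_1$ is of interest, identify the partition

$$F = \begin{bmatrix} a & c^T \\ c & F_s \end{bmatrix} \tag{A.12}$$

i.e., $a = F_{11}$, $c = [F_{12}, F_{13}]^T$, and

$$F_s = \begin{bmatrix} F_{22} & F_{23} \\ F_{32} & F_{33} \end{bmatrix} \tag{A.13}$$

Specifically, in terms of (A.11),

$$a = \frac{m}{1-\rho^2}\,\frac{2-\rho^2}{\sigma_1^2} \tag{A.14}$$

$$c = \frac{m}{1-\rho^2}\left[-\frac{\rho^2}{\sigma_1\sigma_2},\; -\frac{\rho}{\sigma_1}\right]^T \tag{A.15}$$

$$F_s = \frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{2-\rho^2}{\sigma_2^2} & -\dfrac{\rho}{\sigma_2} \\[2mm] -\dfrac{\rho}{\sigma_2} & \dfrac{1+\rho^2}{1-\rho^2} \end{bmatrix} \tag{A.16}$$

Only the inverse $F_s^{-1}$ is needed to compute expressions for $\eta$ (143) and $d^*_{\min}$ (122):

$$\eta^2 = c^T F_s^{-2} c = \|F_s^{-1}c\|^2, \tag{A.17}$$

$$d^*_{\min} = \frac{-\delta}{\sqrt{1 + c^T F_s^{-2} c}}\left[1,\; -c^T F_s^{-1}\right]^T \tag{A.18}$$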

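Using Cramer's rule, the inverse of $F_s$ is, from (A.16),

$$F_s^{-1} = \frac{1}{|F_s|}\,\frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{1+\rho^2}{1-\rho^2} & \dfrac{\rho}{\sigma_2} \\[2mm] \dfrac{\rho}{\sigma_2} & \dfrac{2-\rho^2}{\sigma_2^2} \end{bmatrix} \tag{A.19}$$

where $|F_s|$ is the determinant of $F_s$:

$$|F_s| = \left(\frac{m}{1-\rho^2}\right)^2 \frac{2}{\sigma_2^2(1-\rho^2)} \tag{A.20}$$

Now, in view of (A.15) and (A.19), the vector $c^T F_s^{-1}$ is

$$
\begin{aligned}
c^T F_s^{-1} &= \frac{m}{1-\rho^2}\left[-\frac{\rho^2}{\sigma_1\sigma_2},\; -\frac{\rho}{\sigma_1}\right]\frac{1}{|F_s|}\,\frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{1+\rho^2}{1-\rho^2} & \dfrac{\rho}{\sigma_2} \\[2mm] \dfrac{\rho}{\sigma_2} & \dfrac{2-\rho^2}{\sigma_2^2} \end{bmatrix} \\
&= -\frac{1}{|F_s|}\left(\frac{m}{1-\rho^2}\right)^2\left[\frac{\rho^2(1+\rho^2)}{\sigma_1\sigma_2(1-\rho^2)} + \frac{\rho^2}{\sigma_1\sigma_2},\;\; \frac{\rho^3}{\sigma_1\sigma_2^2} + \frac{\rho(2-\rho^2)}{\sigma_1\sigma_2^2}\right] \\
&= -\frac{1}{|F_s|}\left(\frac{m}{1-\rho^2}\right)^2\frac{1}{\sigma_1\sigma_2}\left[\frac{2}{1-\rho^2}\rho^2,\;\; \frac{2\rho}{\sigma_2}\right]
\end{aligned}
\tag{A.21}
$$

Use the expression (A.20) for $|F_s|$ in (A.21) to obtain

$$c^T F_s^{-1} = -\frac{1}{\sigma_1}\left[\sigma_2\rho^2,\;\; \rho(1-\rho^2)\right] \tag{A.22}$$

Now the quantity $c^T F_s^{-1} c$ is simply

$$c^T F_s^{-1} c = -\frac{1}{\sigma_1}\left[\sigma_2\rho^2,\;\; \rho(1-\rho^2)\right]\frac{m}{1-\rho^2}\begin{bmatrix} -\dfrac{\rho^2}{\sigma_1\sigma_2} \\[2mm] -\dfrac{\rho}{\sigma_1} \end{bmatrix} = \frac{m}{1-\rho^2}\,\frac{\rho^2}{\sigma_1^2} \tag{A.23}$$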


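Using (A.23), the unbiased CR bound $B(\theta,0)$ is next derived. From (30), the definition (A.12), and the identity (2),

$$B(\theta,0) = e_1^T F^{-1} e_1 = \frac{1}{a - c^T F_s^{-1} c} = \frac{1}{\dfrac{m}{1-\rho^2}\,\dfrac{2-\rho^2}{\sigma_1^2} - \dfrac{m}{1-\rho^2}\,\dfrac{\rho^2}{\sigma_1^2}} = \frac{\sigma_1^2}{2m} \tag{A.24}$$

The sensitivity index (A.17) is given by the squared magnitude of (A.22)

$$\eta^2 = c^T F_s^{-2} c = \|F_s^{-1}c\|^2 = \frac{\sigma_2^2}{\sigma_1^2}\rho^4 + \frac{1}{\sigma_1^2}\rho^2(1-\rho^2)^2 \tag{A.25}$$

Finally, using (A.22) in (A.18), we have the following expression for $d^*_{\min}$:

$$d^*_{\min} = \frac{\delta}{\sqrt{1+\eta^2}}\left[-1,\; -\frac{\sigma_2}{\sigma_1}\rho^2,\; -\frac{\rho}{\sigma_1}(1-\rho^2)\right]^T \tag{A.26}$$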

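Estimation of Correlation


In the case where $\rho$ is of interest, identify the partition

$$F = \begin{bmatrix} F_s^\rho & c^\rho \\ [c^\rho]^T & a^\rho \end{bmatrix} \tag{A.27}$$

i.e., $a^\rho = F_{33}$, $c^\rho = [F_{13}, F_{23}]^T$, and

$$F_s^\rho = \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} \tag{A.28}$$

Specifically, in terms of the identities (A.11)

$$a^\rho = \frac{m}{1-\rho^2}\,\frac{1+\rho^2}{1-\rho^2} \tag{A.29}$$

$$c^\rho = \frac{m}{1-\rho^2}\left[-\frac{\rho}{\sigma_1},\; -\frac{\rho}{\sigma_2}\right]^T \tag{A.30}$$

$$F_s^\rho = \frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{2-\rho^2}{\sigma_1^2} & -\dfrac{\rho^2}{\sigma_1\sigma_2} \\[2mm] -\dfrac{\rho^2}{\sigma_1\sigma_2} & \dfrac{2-\rho^2}{\sigma_2^2} \end{bmatrix} \tag{A.31}$$

Only the inverse $[F_s^\rho]^{-1}$ is needed to compute expressions (143) for $\eta$, (122) for $d^*_{\min}$, and (30) for $B(\theta,0)$ (Note: we use a reversal of the previous parameter ordering):

$$\eta^2 = [c^\rho]^T[F_s^\rho]^{-2}c^\rho \tag{A.32}$$

$$d^*_{\min} = \frac{-\delta}{\sqrt{1+\eta^2}}\left[1,\; -[c^\rho]^T[F_s^\rho]^{-1}\right]^T \tag{A.33}$$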


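Using Cramer's rule, the inverse of $[F_s^\rho]$ is, from (A.31),

$$[F_s^\rho]^{-1} = \frac{1}{|F_s^\rho|}\,\frac{m}{1-\rho^2}\begin{bmatrix} \dfrac{2-\rho^2}{\sigma_2^2} & \dfrac{\rho^2}{\sigma_1\sigma_2} \\[2mm] \dfrac{\rho^2}{\sigma_1\sigma_2} & \dfrac{2-\rho^2}{\sigma_1^2} \end{bmatrix} \tag{A.34}$$

where $|F_s^\rho|$ is the determinant of $F_s^\rho$:

$$|F_s^\rho| = \left(\frac{m}{1-\rho^2}\right)^2 \frac{4(1-\rho^2)}{\sigma_1^2\sigma_2^2} \tag{A.35}$$

As a final form, combining (A.34) and (A.35) we obtain

$$[F_s^\rho]^{-1} = \frac{1}{4m}\begin{bmatrix} \sigma_1^2(2-\rho^2) & \rho^2\sigma_1\sigma_2 \\ \sigma_1\sigma_2\rho^2 & \sigma_2^2(2-\rho^2) \end{bmatrix} \tag{A.36}$$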
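Now, in view of (A.30) and (A.36), the vector $[c^\rho]^T[F_s^\rho]^{-1}$ has the form

$$
\begin{aligned}
[c^\rho]^T[F_s^\rho]^{-1} &= \frac{m}{1-\rho^2}\left[-\frac{\rho}{\sigma_1},\; -\frac{\rho}{\sigma_2}\right]\frac{1}{4m}\begin{bmatrix} \sigma_1^2(2-\rho^2) & \rho^2\sigma_1\sigma_2 \\ \rho^2\sigma_1\sigma_2 & \sigma_2^2(2-\rho^2) \end{bmatrix} \\
&= \frac{1}{4(1-\rho^2)}\left[-\sigma_1\rho(2-\rho^2) - \rho^3\sigma_1,\;\; -\sigma_2\rho(2-\rho^2) - \rho^3\sigma_2\right] \\
&= \frac{-\rho}{2(1-\rho^2)}\left[\sigma_1,\; \sigma_2\right]
\end{aligned}
\tag{A.37}
$$

Now the quantity $[c^\rho]^T[F_s^\rho]^{-1}c^\rho$ is simply

$$[c^\rho]^T[F_s^\rho]^{-1}c^\rho = \frac{-\rho}{2(1-\rho^2)}\left[\sigma_1,\; \sigma_2\right]\frac{m}{1-\rho^2}\begin{bmatrix} -\dfrac{\rho}{\sigma_1} \\[2mm] -\dfrac{\rho}{\sigma_2} \end{bmatrix} = \frac{m\rho^2}{(1-\rho^2)^2} \tag{A.38}$$

Using (A.38), the unbiased CR bound $B(\theta,0)$ is next derived. From (30), the definition (A.27), and an identity analogous to (2):

$$B(\theta,0) = e_1^T F^{-1} e_1 = \frac{1}{a^\rho - [c^\rho]^T[F_s^\rho]^{-1}c^\rho} = \frac{1}{\dfrac{m}{1-\rho^2}\,\dfrac{1+\rho^2}{1-\rho^2} - \dfrac{m\rho^2}{(1-\rho^2)^2}} = \frac{(1-\rho^2)^2}{m} \tag{A.39}$$

The sensitivity index $\eta^2$ (A.32) is given by the squared magnitude of (A.37)

$$\eta^2 = [c^\rho]^T[F_s^\rho]^{-2}c^\rho = \frac{\rho^2(\sigma_1^2 + \sigma_2^2)}{4(1-\rho^2)^2} = \frac{\sigma_1^2 + \sigma_2^2}{2}\cdot\frac{\rho^2}{2(1-\rho^2)^2} \tag{A.40}$$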


REFERENCES 


1. P. Stoica and R.L. Moses, “On biased estimators and the Cramer-Rao lower bound,” Signal Processing, Vol. 21, 349-351 (1990).

2. F.A. Graybill, Matrices with Applications in Statistics, Belmont, CA: Wadsworth (1983).

3. R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge, UK: Cambridge University Press (1988).

4. I.N. Herstein, Topics in Algebra, New York, NY: Wiley (1975).

5. J.D. Gorman and A.O. Hero, “Lower bounds on parametric estimation with constraints,” IEEE Transactions on Information Theory, Vol. 36, No. 6, 1285-1301 (1990).

6. I.A. Ibragimov and R.Z. Has’minskii, Statistical Estimation: Asymptotic Theory, New York, NY: Springer-Verlag (1981).

7. W. Rudin, Principles of Mathematical Analysis, New York, NY: McGraw-Hill (1976).

8. G.H. Hardy, J.E. Littlewood, and G. Polya, Inequalities, Cambridge, UK: Cambridge University Press (1952).

9. A. Kuruc, “Lower bounds on multiple-source direction finding in the presence of direction-dependent antenna-array-calibration errors,” Technical Report 799, MIT Lincoln Laboratory, Lexington, MA (November 1989). DTIC AD A215825.

10. K. Deimling, Nonlinear Functional Analysis, New York, NY: Springer-Verlag (1985).

11. A.V. Fend, “On the attainment of Cramer-Rao and Bhattacharyya bounds for the variance of an estimate,” Ann. Math. Stat., Vol. 30, 381-388 (1959).

12. C.R. Rao, Linear Statistical Inference and Its Applications, New York, NY: Wiley (1973).




REPORT DOCUMENTATION PAGE




1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE
3 January 1992

3. REPORT TYPE AND DATES COVERED
Technical Report

4. TITLE AND SUBTITLE
A Cramer-Rao Type Lower Bound for Essentially Unbiased Parameter Estimation

5.  FUNDING  NUMBERS 

6. AUTHOR(S)

Alfred O. Hero

C — F19628-90-C-0002
f’K  —  .S.UOIG 

PR  —  2R1 

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Lincoln Laboratory, MIT
P.O. Box 73
Lexington, MA 02173-9108

8. PERFORMING ORGANIZATION REPORT NUMBER

TR-890

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)

U.S. Air Force
ESD/ENKL
Hanscom AFB, MA 01731-5000

10. SPONSORING/MONITORING AGENCY REPORT NUMBER

KSU-TR-'M-lOt 

11. SUPPLEMENTARY NOTES

None

12a. DISTRIBUTION/AVAILABILITY STATEMENT

Approved for public release; distribution is unlimited.

12b. DISTRIBUTION CODE



14. SUBJECT TERMS
Cramer-Rao bounds, biased estimates, estimation theory, covariance

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT: Unclassified
18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified
19. SECURITY CLASSIFICATION OF ABSTRACT: Unclassified
20. LIMITATION OF ABSTRACT: SAR



